1. Introduction

Let $H = (h_{ij})$ be a complex Hermitian or real symmetric $N \times N$ random matrix with centred matrix entries that are independent up to the symmetry constraint. We assume that the variances $s_{ij} := \mathbb{E}|h_{ij}|^2$ are normalized so that $\sum_j s_{ij} = 1$ for each $i$, and let $\|s\|_\infty := \max_{i,j} s_{ij}$ denote the maximal variance. Let $G_{ab}(z) := (H - z)^{-1}_{ab}$ denote the resolvent matrix entries evaluated at a spectral parameter $z = E + \mathrm{i}\eta$ whose imaginary part $\eta$ is positive and small. It was established in [17] that

$$\Lambda := \max_{a \neq b} |G_{ab}| \lesssim \sqrt{\frac{\|s\|_\infty}{\eta}} \tag{1.1}$$

with high probability for large $N$, up to factors of $N^{\varepsilon}$.

The matrix entries $G_{ab} = G_{ab}(z)$ depend strongly on the entries of the $a$-th and $b$-th columns of $H$, but weakly on the other columns. Focusing on the dependence on $a$ only, this can be seen from the simple expansion formula

$$G_{ab} = -G_{aa} \sum_{i \neq a} h_{ai} G^{(a)}_{ib}, \tag{1.2}$$

where $G^{(a)}$ denotes the resolvent of the $(N-1) \times (N-1)$ minor of $H$ obtained by removing the $a$-th row and column (see Lemma 3.7 below for the general statement). Since $G^{(a)}$ is independent of the family $(h_{ai})_{i=1}^N$, the formula (1.2) expresses $G_{ab}$ as a sum of independent centred random variables (neglecting the prefactor $G_{aa}$, which still depends on $(h_{ai})_{i=1}^N$). Therefore the size of $G_{ab}$ is governed by a fluctuation averaging mechanism, similar to the central limit theorem. This is the main reason why the bound (1.1) is substantially better than the naive estimate $|G_{ab}| \leq \eta^{-1}$.

In this paper we investigate a more subtle phenomenon. To take a simple example, we are interested in averages of resolvent matrix entries of the form

$$\frac{1}{N} \sum_a G_{ab} \tag{1.3}$$

or, more generally, its weighted version

$$\sum_a s_{\mu a} G_{ab}, \tag{1.4}$$

where $\mu$ and $b$ are fixed. We aim to show that, with high probability, these averages are of order $\Lambda^2$, much smaller than the naive bound $\Lambda$ which results from an application of (1.1) to each summand (we shall always work in the regime where $\Lambda \ll 1$). The mechanism behind this improved bound is that for $a \neq a'$ the matrix entries $G_{ab}$ and $G_{a'b}$ are only weakly correlated. To see this, note that, since $h_{ai}$ in (1.2) and $h_{a'i'}$ in the analogous formula

$$G_{a'b} = -G_{a'a'} \sum_{i' \neq a'} h_{a'i'} G^{(a')}_{i'b},$$

are independent, the correlation between $G_{ab}$ and $G_{a'b}$ primarily comes from correlations between $h_{ai}$ and $G^{(a')}_{i'b}$ and between $h_{a'i'}$ and $G^{(a)}_{ib}$. (As above, here we neglect the less important prefactors $G_{aa}$ and $G_{a'a'}$.)

Now $G^{(a')}_{i'b}$ depends only weakly on $h_{ai}$ unless some lower indices coincide: $i = i'$ or $i = b$ or $i' = a$. Such coincidences are atypical, however, and consequently give rise to lower-order terms. Once the smallness of the correlation between $G_{ab}$ and $G_{a'b}$ is established, the variance of the averages (1.3) or (1.4) can be estimated. The smallness of the higher-order correlations between different resolvent matrix entries allows one to compute high moments and turn the variance bound into a high-probability bound. However, keeping track of all weak correlations among a large product of expressions of the form (1.3) with different $a$'s is rather involved, and we shall need to develop a graphical representation to do this effectively.

This idea of exploiting the weak dependence among different resolvent entries of random matrices first appeared in [17] and was subsequently used in [5, 16, 18]. Such estimates provide optimal error bounds in the local semicircle law, a basic ingredient in establishing the universality of local statistics for Wigner matrices.

Our main result in this paper estimates with high probability (weighted) averages of general monomials in the resolvent matrix entries and their complex conjugates, where the averaging is performed on a subset of the indices. A more complicated example is

$$\sum_{a,b} s_{\mu a} s_{\nu b} \Big( |G_{ab}|^2 |G_{a\rho}|^2 - \mathbb{E}_{ab} |G_{ab}|^2 |G_{a\rho}|^2 \Big), \tag{1.5}$$

where $\mu$, $\nu$, and $\rho$ are fixed. Here we subtract from each summand its partial expectation $\mathbb{E}_{ab}$ with respect to the random variables in the $a$-th and $b$-th columns of $H$. (Note that we could also have subtracted $\mathbb{E}_a G_{ab}$ in (1.3) and (1.4) as well, but this expectation turns out to be negligible, unlike the expectation of the manifestly positive quantity $|G_{ab}|^2 |G_{a\rho}|^2$ in (1.5).)



The expression (1.5) can trivially be estimated by $\Lambda^4$ with high probability using the estimate (1.1) on each summand (neglecting that diagonal resolvent matrix entries $G_{aa}$ require a different estimate). However, we can in fact do better: the averaging over two indices gives rise to a cancellation of fluctuations, due to the weak correlations among the summands. Since each averaging independently yields an extra factor $\Lambda$ as in (1.3) and (1.4), it seems plausible that the naive estimate of order $\Lambda^4$ on (1.5) can be improved to $\Lambda^6$. This in fact turns out to be correct in the example (1.5), but in general the principle that each averaging yields one extra $\Lambda$ factor is not optimal. Depending on the structure of the monomial, the gain may be more than a single factor $\Lambda$ per averaged index. For example, averaging in the index $a$ in the quantities

$$(\mathrm{I}) := \sum_a s_{\mu a} \big( G_{ba} G^*_{ab} - \mathbb{E}_a G_{ba} G^*_{ab} \big) \quad \text{and} \quad (\mathrm{II}) := \sum_a s_{\mu a} \big( G_{ba} G_{ab} - \mathbb{E}_a G_{ba} G_{ab} \big) \tag{1.6}$$

has different effects. The naive estimate using (1.1) yields $\Lambda^2$ for both quantities, but (I) is in fact of order $\Lambda^4$ while (II) is only of order $\Lambda^3$ (all estimates are understood with high probability).

The reason behind the gain of a factor $\Lambda^2$ over the naive size in the case of (I) is quite subtle. We already mentioned that the dependence of $G_{ab}$ on the random variables in the $c$-th column is weak if $c \neq a, b$. This is manifested in the identity

$$G_{ab} = G^{(c)}_{ab} + \frac{G_{ac} G_{cb}}{G_{cc}}. \tag{1.7}$$

(This identity first appeared in [15]; see Lemma 3.7 below for a precise statement and related formulas.) Since $G^{(c)}_{ab}$ is independent of the $c$-th column, the $c$-dependence of $G_{ab}$ is contained in the second term of (1.7). This term is naively of order $\Lambda^2$, i.e. smaller than the main term (accepting that $G_{cc}$ in the denominator is harmless; in fact it turns out to be bounded from above and below by universal positive constants).

Computing the variance of (I) results in a double sum $\sum_a \sum_c$. We shall see that, since the first term of (1.7) is independent of $c$, the leading order contribution to the variance in fact comes from the second term. This yields an improvement of one $\Lambda$ over the naive bound $\Lambda^2$. These ideas lead to a bound of order $\Lambda^3$ for both (I) and (II). The idea of using averaging to improve a trivial bound on resolvent entries by an extra factor $\Lambda$ was central in [17]. In that paper this idea was applied to a specific quantity analogous to

(1.8)



When we compute a high moment of the quantities in (1.6), we successively use formulas (1.7) and (1.2) and take partial expectation in the expanded indices. The result is the average of a high-order monomial of resolvent matrix entries. Whether this averaging reduces the naive size depends on the precise structure of the monomial. For example,

$$\sum_c s_{\mu c} |G_{bc}|^2 = O(\Lambda^2) \tag{1.9}$$

and this estimate is optimal, while

$$\sum_c s_{\mu c} G_{bc} G_{cb} = O(\Lambda^3). \tag{1.10}$$



It turns out that the average of the high-order monomial obtained from computing a high moment of (I) in (1.6) contains several summations of the type (1.10), while the analogous formula for (II) contains only summations of the type (1.9) (at least to leading order). Whether the additional gain is present or not depends on the precise structure of the original monomial, in particular on how many times the averaging index appears in an entry of $G$ or $G^*$. In this regard the expressions (I) and (II) differ, which is the reason why their sizes differ. Our main result (Theorem 4.8) expresses the precise relation between the maximal gain and the structure of the monomial. As it turns out, this dependence is quite subtle. The main purpose of this paper is to give a systematic rule, applicable to arbitrary monomials in the resolvent entries, which determines the gain from all indices over which an average is taken. In particular, averaging over certain indices yields an improvement of order $\Lambda^2$; this is a novel phenomenon. This observation is crucial in the application of our results to the problem of quantum diffusion in random band matrices [13].

Finally, we briefly explain the improvement from the naive size $\Lambda^2$ to $\Lambda^3$ for the left-hand side of (1.10). It follows from the estimate of order $\Lambda^3$ on (II) in (1.6) and from the fact that $\mathbb{E}_c G_{ac} G_{cb} = O(\Lambda^3)$ for any $a$, $b$. That the expectation $\mathbb{E}_c G_{ac} G_{cb}$ itself is smaller than its naive size $\Lambda^2$ may be seen by expanding $G_{ac} G_{cb}$ in the index $c$ using formulas of the type (1.2). It turns out that $\mathbb{E}_c G_{ac} G_{cb}$, viewed as a vector indexed by $c$ and keeping $a$ and $b$ fixed, satisfies a stable self-consistent vector equation (see (7.16)). The analysis of this equation leads to the improved bound on $\mathbb{E}_c G_{ac} G_{cb}$ of order $\Lambda^3$.

Bounds on averages of resolvents of random matrices have played an essential role in establishing the local semicircle law with an optimal error bound. We recall that in the simplest case of Wigner matrices, where $s_{ij} = N^{-1}$, the trace of the resolvent

$$m_N(z) := \frac{1}{N} \operatorname{Tr} G(z)$$

is well approximated by the Stieltjes transform of the celebrated Wigner semicircle law

$$m(z) := \frac{1}{2\pi} \int_{-2}^{2} \frac{\sqrt{4 - x^2}}{x - z} \, \mathrm{d}x.$$

The optimal bound is

$$|m(z) - m_N(z)| \lesssim \frac{1}{N\eta} \tag{1.11}$$

with high probability (see [16] for the precise statement and the history of this result). One of the main steps in proving this optimal bound is to exploit that $G_{aa}$ and $G_{a'a'}$ are only weakly correlated for $a \neq a'$. Hence the average of $G_{aa}$ in $a$ in the definition of $m_N(z)$ fluctuates on a smaller scale than the fluctuations of $G_{aa}$. Various forms of this fluctuation averaging were formulated in [5, 16, 17]. They were the key inputs to prove (1.11) and its analogue for sample covariance matrices in [18]. In Proposition 6.1 we present a simple special case of our main result, Theorem 4.8. This proposition yields generalizations of estimates analogous to the previous fluctuation averaging bounds with a more streamlined proof. A somewhat different simplification was given in [18].
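As an informal numerical illustration of the approximation behind (1.11), the following sketch compares $m_N(z)$ for one sample of a real symmetric Wigner matrix with $m(z)$. The matrix size, the seed, the Gaussian entries, and the spectral parameter are arbitrary choices made here for illustration and are not taken from the paper; the closed-form expression for $m$ is the root of the standard quadratic relation $m^2 + zm + 1 = 0$ with positive imaginary part.

```python
import numpy as np

def m_semicircle(z):
    """Stieltjes transform of the semicircle law: root of m^2 + z m + 1 = 0 with Im m > 0."""
    s = np.sqrt(z * z - 4 + 0j)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2

rng = np.random.default_rng(1)          # fixed seed, illustrative only
N = 500
A = rng.normal(size=(N, N))
H = (A + A.T) / np.sqrt(2 * N)          # real symmetric, off-diagonal variance 1/N
eigs = np.linalg.eigvalsh(H)

z = 0.3 + 0.1j                          # spectral parameter z = E + i*eta
m_N = np.mean(1.0 / (eigs - z))         # (1/N) Tr G(z), computed from the eigenvalues
err = abs(m_N - m_semicircle(z))
print(err, 1.0 / (N * z.imag))          # observed error next to the scale 1/(N*eta)
```

A single sample does not verify a high-probability statement; the printout only lets one compare the observed error with the scale $1/(N\eta)$.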

On the one hand, Theorem 4.8 is more general than its predecessors since it is applicable to arbitrary monomials in $G$ and $G^*$, and also holds for universal Wigner matrices with nonconstant variances. On the other hand, and more importantly, Theorem 4.8 gives a stronger bound because it exploits the additional cancellation effect explained in connection with the different bounds on the two quantities in (1.6). This extra cancellation mechanism was not present in [5, 16-18].

In a separate paper [3] we apply the stronger bound

$$\sum_a s_{\mu a} \big( |G_{ab}|^2 - \mathbb{E}_a |G_{ab}|^2 \big) = O(\Lambda^4) \tag{1.12}$$

to derive a lower bound on the localization length of random band matrices. Extensions of the methods of [5, 16-18] would have yielded only

$$\sum_a s_{\mu a} \big( |G_{ab}|^2 - \mathbb{E}_a |G_{ab}|^2 \big) = O(\Lambda^3). \tag{1.13}$$

Had we had only (1.13) available in [3], the resulting estimate on the localization length would not have improved the previously known results [1, 2] on eigenvector delocalization.

We conclude this section with a roadmap of the paper. In Section 2 we define our main objects and introduce notation used throughout the paper. Our main result is Theorem 4.8 in Section 4. Before stating it in full generality, we first present a special case, Proposition 3.3 in Section 3. In order to motivate the concepts underlying Theorem 4.8, we not only state this special case but also give a sketch of its proof, in Section 3.2. This is done before the main theorem is stated. A reader who prefers an inductive presentation should follow our sections in sequential order. A reader who wants to jump quickly to the main result may skip Section 3.2. However, some concepts introduced in Section 3.2 are needed later in the proof (but not in the statement) of Theorem 4.8. The full proof of Theorem 4.8 is presented in Sections 6-9, following Section 3 where we give an outline of the proof and explain how Sections 6-9 are related.



2. Setup

Let $(h_{ij} : i \leq j)$ be a family of independent, complex-valued random variables $h_{ij} \equiv h_{ij}^{(N)}$ satisfying $\mathbb{E} h_{ij} = 0$ and $h_{ii} \in \mathbb{R}$ for all $i$. For $i > j$ we define $h_{ij} := \bar{h}_{ji}$, and denote by $H \equiv H_N = (h_{ij})_{i,j=1}^N$ the $N \times N$ matrix with entries $h_{ij}$. By definition, $H$ is Hermitian: $H = H^*$. We abbreviate

$$s_{ij} := \mathbb{E}|h_{ij}|^2, \qquad M \equiv M_N := \frac{1}{\max_{i,j} s_{ij}}. \tag{2.1}$$

In particular, we have the bound

$$s_{ij} \leq M^{-1} \tag{2.2}$$

for all $i$ and $j$. We introduce the $N \times N$ symmetric matrix $S \equiv S_N = (s_{ij})_{i,j=1}^N$. We assume that $S$ is (doubly) stochastic:

$$\sum_j s_{ij} = 1 \tag{2.3}$$

for all $i$. We shall always assume the bounds

$$N^{\delta} \leq M \leq N \tag{2.4}$$

for some fixed $\delta > 0$.

Example 2.1 (Band matrix). Fix $d \in \mathbb{N}$. Let $f$ be a bounded and symmetric (i.e. $f(x) = f(-x)$) probability density on $\mathbb{R}^d$. Let $L$ and $W$ be integers satisfying

$$L^{\delta'} \leq W \leq L$$

for some fixed $\delta' > 0$. Define the $d$-dimensional discrete torus

$$\mathbb{T}^d_L := [-L/2, L/2)^d \cap \mathbb{Z}^d.$$

Thus, $\mathbb{T}^d_L$ has $N = L^d$ lattice points; and we may identify $\mathbb{T}^d_L$ with $\{1, \dots, N\}$. We define the canonical representative of $i \in \mathbb{Z}^d$ through

$$[i]_L := (i + L\mathbb{Z}^d) \cap \mathbb{T}^d_L.$$

Then $H$ is a $d$-dimensional band matrix with band width $W$ and profile function $f$ if

$$s_{ij} = \frac{1}{Z_L} f\Big( \frac{[i - j]_L}{W} \Big),$$

where $Z_L$ denotes the normalization constant for which (2.3) holds. It is not hard to see that $M = (W^d + O(W^{d-1}))/\|f\|_\infty$ as $L \to \infty$. The rows and columns of $H$ are thus indexed by the lattice points in $\mathbb{T}^d_L$, i.e. they are equipped with the geometry of $\mathbb{Z}^d$. For $d = 1$, assuming that $f$ is compactly supported, the matrix entry $h_{ij}$ vanishes if $|i - j|$ is larger than $CW$, i.e. $H$ is a band matrix in the traditional sense.
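The construction in Example 2.1 can be sketched numerically for $d = 1$. The function names `band_variances` and `box`, the uniform profile, and the values of $L$ and $W$ below are illustrative assumptions and do not come from the paper; the normalization is chosen row by row so that (2.3) holds.

```python
import numpy as np

def band_variances(L, W, f):
    """Variance matrix s_ij = f([i - j]_L / W) / Z_L on the one-dimensional torus of size L."""
    idx = np.arange(L)
    diff = idx[:, None] - idx[None, :]
    rep = (diff + L // 2) % L - L // 2      # canonical representative in [-L/2, L/2), L even
    S = f(rep / W)
    Z = S.sum(axis=1, keepdims=True)        # same value in every row (translation invariance)
    return S / Z

def box(x):
    """Bounded symmetric probability density on R: uniform on [-1/2, 1/2]."""
    return (np.abs(x) <= 0.5).astype(float)

S = band_variances(L=200, W=20, f=box)
M = 1.0 / S.max()                            # definition (2.1)
print(M)                                     # equals W + 1 here, i.e. W + O(1) with ||f||_inf = 1
```

With this profile every row has $W + 1$ nonzero entries, which matches $M = (W^d + O(W^{d-1}))/\|f\|_\infty$ for $d = 1$.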

It is often convenient to use the normalized entries

$$\zeta_{ij} := (s_{ij})^{-1/2} h_{ij},$$

which satisfy $\mathbb{E}\zeta_{ij} = 0$ and $\mathbb{E}|\zeta_{ij}|^2 = 1$. (If $s_{ij} = 0$ we set for convenience $\zeta_{ij}$ to be a normalized Gaussian, so that these relations remain valid. Of course in this case the law of $\zeta_{ij}$ is immaterial.) We assume that the random variables $\zeta_{ij}$ have finite moments, uniformly in $N$, $i$, and $j$, in the sense that for all $p \in \mathbb{N}$ there is a constant $\mu_p$ such that

$$\mathbb{E}|\zeta_{ij}|^p \leq \mu_p \tag{2.5}$$

for all $N$, $i$, and $j$. We make this assumption to streamline notation in the statements of results such as Theorem 4.8 and the proofs. In fact, our results hold, with the same proof, provided (2.5) is valid for some large but fixed $p$. See Remark 4.11 below for a more precise statement.

Throughout the following we use a spectral parameter $z \in \mathbb{C}$ satisfying $\operatorname{Im} z > 0$. We shall use the notation

$$z = E + \mathrm{i}\eta$$

without further comment. The Stieltjes transform of Wigner's semicircle law is defined by

$$m \equiv m(z) := \frac{1}{2\pi} \int_{-2}^{2} \frac{\sqrt{4 - \xi^2}}{\xi - z} \, \mathrm{d}\xi. \tag{2.6}$$

To avoid confusion, we remark that the Stieltjes transform $m$ was denoted by $m_{sc}$ in the papers [5-17], in which $m$ had a different meaning from (2.6). It is well known that the Stieltjes transform $m$ satisfies the identity

$$m(z) + \frac{1}{m(z)} + z = 0. \tag{2.7}$$
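The identity (2.7) can be checked numerically against the integral (2.6). The sketch below is only a sanity check under choices made here: the function names, the midpoint quadrature with `n` nodes, and the sample value of $z$ are not part of the paper.

```python
import numpy as np

def m_semicircle(z):
    """Root of m + 1/m + z = 0, i.e. m^2 + z m + 1 = 0, with Im m > 0 (for Im z > 0)."""
    s = np.sqrt(z * z - 4 + 0j)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2

def m_quadrature(z, n=200_000):
    """Midpoint-rule approximation of the integral in (2.6)."""
    h = 4.0 / n
    xi = -2 + (np.arange(n) + 0.5) * h
    return np.sum(np.sqrt(4 - xi**2) / (xi - z)) * h / (2 * np.pi)

z = 0.3 + 0.1j
m = m_semicircle(z)
print(abs(m + 1 / m + z))          # residual of (2.7)
print(abs(m - m_quadrature(z)))    # discrepancy between closed form and (2.6)
```

The two roots of the quadratic have product one, so exactly one of them has positive imaginary part; that is the one selected here.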

We define the resolvent of $H$ through

$$G(z) := (H - z)^{-1},$$

and denote its entries by $G_{ij}(z)$. We also write $G^*(z) := (G(z))^* = (H - \bar{z})^{-1}$. We often drop the argument $z$ and write $G \equiv G(z)$ as well as $G^* \equiv G^*(z)$.

Definition 2.2 (Minors). For $T \subset \{1, \dots, N\}$ we define $H^{(T)}$ by

$$(H^{(T)})_{ij} := \mathbf{1}(i \notin T) \, \mathbf{1}(j \notin T) \, h_{ij}.$$

Moreover, we define the resolvent of $H^{(T)}$ through

$$G^{(T)}_{ij}(z) := (H^{(T)} - z)^{-1}_{ij}.$$

We also set

$$\sum_i^{(T)} := \sum_{i \,:\, i \notin T}.$$

When $T = \{a\}$, we abbreviate $(\{a\})$ by $(a)$ in the above definitions; similarly, we write $(ab)$ instead of $(\{a, b\})$.

Definition 2.3 (Partial expectation and independence). Let $X \equiv X(H)$ be a random variable. For $i \in \{1, \dots, N\}$ define the operations $P_i$ and $Q_i$ through

$$P_i X := \mathbb{E}(X \mid H^{(i)}), \qquad Q_i X := X - P_i X.$$

We call $P_i$ partial expectation in the index $i$. Moreover, we say that $X$ is independent of $T \subset \{1, \dots, N\}$ if $X = P_i X$ for all $i \in T$.

The following definition introduces a notion of a high-probability bound that is suited for our purposes.

Definition 2.4 (Stochastic domination). Let $X = (X^{(N)}(u) : N \in \mathbb{N}, u \in U^{(N)})$ be a family of random variables, where $U^{(N)}$ is a possibly $N$-dependent parameter set. Let $\Psi = (\Psi^{(N)}(u) : N \in \mathbb{N}, u \in U^{(N)})$ be a deterministic family satisfying $\Psi^{(N)}(u) \geq 0$. We say that $X$ is stochastically dominated by $\Psi$, uniformly in $u$, if for all (small) $\varepsilon > 0$ and (large) $D > 0$ we have

$$\sup_{u \in U^{(N)}} \mathbb{P}\Big[ |X^{(N)}(u)| > N^{\varepsilon} \Psi^{(N)}(u) \Big] \leq N^{-D}$$

for large enough $N \geq N_0(\varepsilon, D)$. Unless stated otherwise, throughout this paper the stochastic domination will always be uniform in all parameters apart from the parameter $\delta$ in (2.4) and the sequence of constants $\mu_p$ in (2.5); thus, $N_0(\varepsilon, D)$ also depends on $\delta$ and $\mu_p$. If $X$ is stochastically dominated by $\Psi$, uniformly in $u$, we use the equivalent notations

$$X \prec \Psi \quad \text{and} \quad X = O_\prec(\Psi).$$

For example, using Chebyshev's inequality and (2.5) one easily finds that

$$h_{ij} \prec (s_{ij})^{1/2} \leq M^{-1/2}, \tag{2.8}$$

so that we may also write $h_{ij} = O_\prec((s_{ij})^{1/2})$. The relation $\prec$ satisfies the familiar algebraic rules of order relations. For instance if $A_1 \prec \Psi_1$ and $A_2 \prec \Psi_2$ then $A_1 + A_2 \prec \Psi_1 + \Psi_2$ and $A_1 A_2 \prec \Psi_1 \Psi_2$. Moreover, if $A \prec \Psi$ and there is a constant $C > 0$ such that $\Psi \geq N^{-C}$ and $|A| \leq N^{C}$ almost surely, then $P_i A \prec \Psi$ and $Q_i A \prec \Psi$. More general statements in this spirit are given in Lemma 3.6 below.

Let $\gamma > 0$ be a fixed small positive constant and let $(\mathbf{S}^{(N)})$ be a sequence of domains satisfying

$$\mathbf{S}^{(N)} \subset \big\{ z \in \mathbb{C} : -10 \leq E \leq 10, \; M^{-1+\gamma} \leq \eta \leq 10 \big\}.$$

As usual, we shall systematically omit the index $N$ on $\mathbf{S}$.

Definition 2.5. A positive $N$-dependent deterministic function $\Psi \equiv \Psi^{(N)}$ on $\mathbf{S}$ is called a control parameter. The control parameter $\Psi$ is admissible if there is a constant $c > 0$ such that

$$M^{-1/2} \leq \Psi(z) \leq N^{-c} \tag{2.9}$$

for all $N$ and $z \in \mathbf{S}$.

In this paper we always consider families $X^{(N)}(u) = X^{(N)}_i(z)$ indexed by $u = (z, i)$, where $z \in \mathbf{S}$ and $i$ takes on values in some finite (possibly $N$-dependent or empty) index set.

We slightly modify the definition (1.1) to include a control on the diagonal entries of $G$ in addition to the off-diagonal entries. For the rest of the paper, we define the ($z$-dependent) random variable

$$\Lambda(z) := \max_{x,y} \big| G_{xy}(z) - \delta_{xy} m(z) \big|.$$

The variable $\Lambda$ will play the role of a random control parameter. If $\Psi$ is an admissible control parameter, the lower bound on $\Psi$ in (2.9) together with (2.8) imply that

$$h_{ij} \prec \Psi. \tag{2.10}$$

3. Simple examples and ingredients of the proof



In this section we give an informal overview of fluctuation averaging, by stating and sketching the proofs of a few simple, yet representative, cases. Our starting point will always be an admissible control parameter $\Psi$ that controls $\Lambda$, i.e. $\Lambda \prec \Psi$. In addition to $\Psi$, we introduce the secondary control parameter

$$\Phi := \min\Big\{ \varrho \big( \Psi + M^{-1/2} \Psi^{-1} \big), \, 1 \Big\}, \tag{3.1}$$

where we defined the coefficient$^1$

$$\varrho := \big\| (1 - m^2 S)^{-1} \big\|_{\ell^\infty \to \ell^\infty}. \tag{3.2}$$

Thus, $\Phi$ is defined in terms of the primary control parameter $\Psi$, although we usually do not indicate this explicitly.



Remark 3.1. We use the somewhat complicated definitions (3.1) and (3.2) because they emerge naturally from our argument, and do not require us to impose any further conditions on the matrix $H$ or the spectral parameter $z$. The parameter $\Phi$ will describe the gain associated with a charged vertex or a chain vertex; see Definitions 4.7 and 5.1 below.

In the motivating example of band matrices (Example 2.1), the parameter $\Phi$ may be considerably simplified. Indeed, in that case there is a positive constant $C$ such that

$$\varrho \leq \frac{C \log N}{(\operatorname{Im} m)^2}, \tag{3.3}$$

as proved in Proposition B.2 below. For most applications, we are interested in the bulk spectrum of the band matrix, i.e. $E \in [-2 + \kappa, 2 - \kappa]$ for some fixed $\kappa > 0$. In that case the relation $\operatorname{Im} m(z) \asymp \sqrt{\eta + 2 - |E|}$ (proved e.g. in [17, Lemma 4.2]) yields $\operatorname{Im} m \geq c$ for some positive constant $c$ depending on $\kappa$. We conclude that $1 \leq \varrho \leq C \log N$; the logarithmic factor in the upper bound is irrelevant, since $\Phi$ will always be used as a deterministic control parameter in Definition 2.4. In summary: for the bulk spectrum of a band matrix, we may replace $\Phi$ with $\Psi + M^{-1/2} \Psi^{-1}$.

Moreover, in typical applications the imaginary part $\eta$ of the spectral parameter $z$ is small enough that $\Psi \geq M^{-1/4}$. In this case $\Phi$ and $\Psi$ are comparable (in the bulk spectrum), and hence interchangeable as control parameters in Definition 2.4.



Remark 3.2. We have the lower bound

$$1/2 \leq |1 - m^2|^{-1} \leq \varrho, \tag{3.4}$$

where the first inequality follows from (3.11) below, and the second from the identity $(1 - m^2 S)^{-1} \mathbf{e} = (1 - m^2)^{-1} \mathbf{e}$ with the vector $\mathbf{e} := (1, \dots, 1)$. We therefore have the bounds $\Psi \leq 2\Phi \leq 2$.
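The quantities in (3.1), (3.2) and (3.4) are straightforward to evaluate for a concrete variance matrix. The sketch below does so for a small one-dimensional band profile; the helper names `rho` and `phi`, the choice of $S$, the spectral parameter, and the value $\Psi = M^{-1/4}$ are assumptions made purely for illustration.

```python
import numpy as np

def m_semicircle(z):
    """Root of m + 1/m + z = 0 with Im m > 0."""
    s = np.sqrt(z * z - 4 + 0j)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2

def rho(S, z):
    """Coefficient (3.2): the l^infty -> l^infty norm is the maximal absolute row sum."""
    m = m_semicircle(z)
    A = np.linalg.inv(np.eye(len(S)) - m**2 * S)
    return np.abs(A).sum(axis=1).max()

def phi(S, z, psi):
    """Secondary control parameter (3.1) for a given value psi of the primary one."""
    M = 1.0 / S.max()
    return min(rho(S, z) * (psi + M**-0.5 / psi), 1.0)

# Illustrative stochastic band profile on a ring of N sites with half-width W.
N, W = 200, 10
idx = np.arange(N)
dist = np.abs((idx[:, None] - idx[None, :] + N // 2) % N - N // 2)
S = (dist <= W).astype(float)
S /= S.sum(axis=1, keepdims=True)

z = 0.3 + 0.1j
m = m_semicircle(z)
r = rho(S, z)
psi = (1.0 / S.max()) ** -0.25
Phi = phi(S, z, psi)
print(1 / abs(1 - m**2), r, psi, Phi)
```

Because $S$ is stochastic, every row sum of $(1 - m^2 S)^{-1}$ equals $(1 - m^2)^{-1}$, which is why the maximal absolute row sum cannot be smaller than $|1 - m^2|^{-1}$.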

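In this section we sketch the proof of the following result.

Proposition 3.3 (Simple examples). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Then we have

$$\frac{1}{N} \sum_a^{(\mu)} G_{\mu a} G_{a\mu} \prec \Psi^2 \Phi, \qquad \frac{1}{N} \sum_a^{(\mu)} G_{\mu a} G^*_{a\mu} \prec \Psi^2, \tag{3.5}$$

as well as

$$\frac{1}{N} \sum_a^{(\mu)} Q_a\big( G_{\mu a} G_{a\mu} \big) \prec \Psi^3, \qquad \frac{1}{N} \sum_a^{(\mu)} Q_a\big( G_{\mu a} G^*_{a\mu} \big) \prec \Psi^3 \Phi. \tag{3.6}$$

In addition, we have the bounds

(3.7)

$^1$ Here we use the notation $\|A\|_{\ell^\infty \to \ell^\infty} = \max_i \sum_j |A_{ij}|$ for the operator norm on $\ell^\infty(\mathbb{C}^N)$.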



Remark 3.4. As explained in Remark 3.1, typically $\Phi$ and $\Psi$ are comparable. In this case each factor $\Phi$ on the right-hand sides of the estimates in (3.5), (3.6), and (3.7) can be replaced with $\Psi$; for instance, the right-hand sides in (3.6) become $\Psi^3$ and $\Psi^4$. Thus we may keep track of the improving effect of the average using a simple power counting in the single parameter $\Psi$, replacing each $\Phi$ with a $\Psi$.



The significance of Proposition 3.3 is the following. The trivial bound $G_{a\mu} \prec \Psi$ (which follows immediately from $\Lambda \prec \Psi$) implies, for example, that $\frac{1}{N} \sum_a^{(\mu)} G_{\mu a} G_{a\mu} \prec \Psi^2$. The first estimate in (3.5) represents an improvement from $\Psi^2$ to $\Psi^2 \Phi$. This improvement is due to the averaging over the index $a$ of fluctuating quantities with almost vanishing expectation. We shall refer to such vertices as charged; see Definition 4.7 below. In contrast, there is no such improvement in the second estimate of (3.5), since $G_{\mu a} G^*_{a\mu} = |G_{\mu a}|^2$ is always nonnegative. If we subtract the expectation (for technical reasons, we subtract only the partial expectation, i.e. take $Q_a = 1 - P_a$), then the averaging becomes effective and it improves the average of $G_{\mu a} G^*_{a\mu}$ by two orders, from $\Psi^2$ to $\Psi^3 \Phi$. Interestingly, subtracting the expectation in the average of $G_{\mu a} G_{a\mu}$ does not improve the estimate further; compare the first bounds in (3.5) and (3.6). (In fact, we get the only slightly stronger bound $\Psi^3$ instead of $\Psi^2 \Phi$.) These examples indicate that the improving effect of the averaging heavily depends on the structure of the resolvent monomials.

We shall be concerned with averages of more general expressions. Roughly, we consider arbitrary monomials in the resolvent entries $(G_{ij})$. Some of the indices are summed. The summation is always performed with respect to a weight, a nonnegative quantity which sums to one. In the examples of Proposition 3.3 the weight was $N^{-1}$. Generally, we want to allow weights consisting of factors $N^{-1}$ as well as $s_{ij}$; recall that $\sum_j s_{ij} = \sum_j N^{-1} = 1$. Thus, in addition to (3.5), (3.6), and (3.7) we have for example the bounds



-< f 2 $, 



E 



< * 3 $, ^2s va {G aa ~m) -< 



(3.8) 



A slightly more involved average is 



a . b 



SfiaSpb Qb{G lla G a bGi )u G a i ) G va j 



(3.9) 



where $\mu$, $\nu$, and $\rho$ are fixed. In Theorem 4.8 we shall see that (3.9) is stochastically dominated by $\Psi^6 \Phi^2$. This means that the double averaging and the effect of one $Q$-operation amounts to an improvement of a power three, from the trivial bound $\Psi^5$ to $\Psi^6 \Phi^2$. It may be tempting to think that each average and each factor $Q$ improves the trivial bound by one power of $\Psi$ or $\Phi$, but this naive rule already fails in some of the simplest examples in (3.5) and (3.6). The relation between the averaging structure and the improved power of $\Psi$ and $\Phi$ is more intricate. Our final goal (see Theorem 4.8) is to establish an optimal result for general monomials, which takes into account the precise effect of all averages.

More generally, we shall be interested in averaging arbitrary monomials $Z_{\mathbf{a}}$ in the resolvent entries. Each such monomial contains a family of summation indices $\mathbf{a}$ and external indices $\boldsymbol{\mu}$. In the example (3.9), we have

Z a = G^GabGl^G^G^a , a = (a, b) , (j, = (n, v) . (3.10) 

The most convenient way to define such a monomial $Z_{\mathbf{a}}$ is using a graph. The vertices are associated with the summation and external indices, and a resolvent entry $G_{xy}$ is represented as a directed edge from vertex $x$ to vertex $y$. We draw an edge associated with a resolvent entry $G_{xy}$ with a solid line, and an edge associated with a resolvent entry $G^*_{xy}$ with a dashed line. See Figure 3.1. As it turns out, the gain in powers of $\Psi$ resulting from the averaging has a simple expression in terms of such graphs. Moreover, this graphical representation is a key tool in our proofs.

Figure 3.1. Graphs associated with the monomials (from left to right) $G_{\mu a} G_{a\mu}$, $G_{\mu a} G^*_{a\mu}$, and the monomial $Z_{\mathbf{a}}$ from (3.10).

Note that neither the $Q$-factors nor the averaging weights are encoded in the graphical structure. Later we shall give a more precise definition of the class of weights we consider, but as an orientation to the reader, we emphasize that they play a secondary role. As long as the weights ensure an effective averaging over at least $M$ values of each summation index, their final role is simply accounted for in the additional factor $M^{-1/2} \Psi^{-1}$ in the definition of $\Phi$. The key improvement on the power of $\Psi$ in the final estimate is solely determined by the structure of $Z_{\mathbf{a}}$ and by the locations of the $Q$-factors.

3.1. Preliminaries. In this subsection we collect some basic facts that will be used throughout the paper. We use $C$ to denote a generic large positive constant, which may depend on some fixed parameters and whose value may change from one expression to the next. For two positive quantities $A_N$ and $B_N$ we use the notation $A_N \asymp B_N$ to mean $C^{-1} A_N \leq B_N \leq C A_N$.

Lemma 3.5. There is a constant $c > 0$ such that for $E \in [-10, 10]$ and $\eta \in (0, 10]$ we have

$$c \leq |m(z)| \leq 1. \tag{3.11}$$

Proof. See Lemma 4.2 in [17]. □

The following lemma collects basic algebraic properties of stochastic domination $\prec$.

Lemma 3.6. (i) Suppose that $X(u, v) \prec \Psi(u, v)$ uniformly in $u \in U$ and $v \in V$. If $|V| \leq N^{C}$ for some constant $C$ then

$$\sum_{v \in V} X(u, v) \prec \sum_{v \in V} \Psi(u, v)$$

uniformly in $u$.

(ii) Suppose that $X_1(u) \prec \Psi_1(u)$ uniformly in $u$ and $X_2(u) \prec \Psi_2(u)$ uniformly in $u$. Then

$$X_1(u) X_2(u) \prec \Psi_1(u) \Psi_2(u)$$

uniformly in $u$.

(iii) Suppose that $\Psi(u) \geq N^{-C}$ for all $u$ and that for all $p$ there is a constant $C_p$ such that $\mathbb{E}|X(u)|^p \leq N^{C_p}$ for all $u$. Then, provided that $X(u) \prec \Psi(u)$ uniformly in $u$, we have

$$P_a X(u) \prec \Psi(u) \quad \text{and} \quad Q_a X(u) \prec \Psi(u)$$

uniformly in $u$ and $a$.

Proof. The claims (i) and (ii) follow from a simple union bound. The claim (iii) follows from Chebyshev's inequality, using a high-moment estimate combined with Jensen's inequality for partial expectation. We omit the details. □

We shall frequently make use of Schur's well-known complement formula, which we write as

$$\frac{1}{G^{(T)}_{ii}} = h_{ii} - z - \sum_{k,l}^{(Ti)} h_{ik} G^{(Ti)}_{kl} h_{li}, \tag{3.12}$$

where $i \notin T \subset \{1, \dots, N\}$.

The following resolvent identities form the backbone of all of our calculations. The idea behind them is that a resolvent matrix entry $G_{ij}$ depends strongly on the $i$-th and $j$-th columns of $H$, but weakly on all other columns. The first set of identities (called Family A) determines how to make a resolvent matrix entry $G_{ij}$ independent of an additional index $k \neq i, j$. The second set of identities (Family B) expresses the dependence of a resolvent matrix entry $G_{ij}$ on the matrix entries in the $i$-th or in the $j$-th column of $H$.

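Lemma 3.7 (Resolvent identities). For any Hermitian matrix $H$ and $T \subset \{1, \dots, N\}$ the following identities hold.

Family A. For $i, j, k \notin T$ and $k \neq i, j$ we have

$$G^{(T)}_{ij} = G^{(Tk)}_{ij} + \frac{G^{(T)}_{ik} G^{(T)}_{kj}}{G^{(T)}_{kk}}, \qquad \frac{1}{G^{(T)}_{ii}} = \frac{1}{G^{(Tk)}_{ii}} - \frac{G^{(T)}_{ik} G^{(T)}_{ki}}{G^{(T)}_{ii} G^{(Tk)}_{ii} G^{(T)}_{kk}}. \tag{3.13}$$

Family B. For $i, j \notin T$ satisfying $i \neq j$ we have

$$G^{(T)}_{ij} = -G^{(T)}_{ii} \sum_k^{(Ti)} h_{ik} G^{(Ti)}_{kj} = -G^{(T)}_{jj} \sum_k^{(Tj)} G^{(Tj)}_{ik} h_{kj}, \tag{3.14a}$$

$$G^{(T)}_{ij} = G^{(T)}_{ii} G^{(Ti)}_{jj} \Big( -h_{ij} + \sum_{k,l}^{(Tij)} h_{ik} G^{(Tij)}_{kl} h_{lj} \Big), \tag{3.14b}$$

$$\frac{1}{G^{(T)}_{ii}} = \frac{1}{m} - \Big( -h_{ii} + Z^{(T)}_i + U^{(Ti)}_i \Big), \tag{3.14c}$$

where we defined

$$Z^{(T)}_i := Q_i \sum_{k,l}^{(Ti)} h_{ik} G^{(Ti)}_{kl} h_{li}, \qquad U^{(S)}_i := \sum_k^{(S)} s_{ik} G^{(S)}_{kk} - m. \tag{3.15}$$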



Proof. The first identity of (3.13) was proved in Lemma 4.2 of [15]. The second identity of (3.13) is an immediate consequence of the first. The identities (3.14a) were proved in Lemma 6.10 of [6], and (3.14b) follows by iterating (3.14a) twice. Finally, (3.14c) (together with (3.15)) follows easily from (3.12), (2.7), the partition $1 = Q_i + P_i$, and the definition (2.1). □
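The identities of Lemma 3.7 are purely algebraic, so they can be sanity-checked on any Hermitian matrix. The sketch below checks the first identity of (3.13), the first identity of (3.14a), and Schur's formula (3.12), all with $T = \emptyset$. The helper `resolvent_minor`, the matrix size, the seed, and the value of $z$ are illustrative assumptions; following Definition 2.2, the minor is kept as an $N \times N$ matrix with the rows and columns in $T$ set to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H = (A + A.conj().T) / np.sqrt(4 * N)      # Hermitian test matrix
z = 0.3 + 0.5j

def resolvent_minor(H, z, T=()):
    """G^{(T)}(z) = (H^{(T)} - z)^{-1}, where H^{(T)} has rows and columns in T zeroed out."""
    keep = np.array([i not in T for i in range(len(H))], dtype=float)
    HT = H * np.outer(keep, keep)
    return np.linalg.inv(HT - z * np.eye(len(H)))

G = resolvent_minor(H, z)
i, j, k = 0, 1, 2
Gk = resolvent_minor(H, z, T=(k,))
Gi = resolvent_minor(H, z, T=(i,))
others = [a for a in range(N) if a != i]

# Family A, first identity of (3.13) with T empty
rhs_A = Gk[i, j] + G[i, k] * G[k, j] / G[k, k]

# Family B, first identity of (3.14a) with T empty
rhs_B = -G[i, i] * sum(H[i, a] * Gi[a, j] for a in others)

# Schur complement formula (3.12) with T empty
schur = H[i, i] - z - sum(H[i, a] * Gi[a, b] * H[b, i] for a in others for b in others)

print(abs(G[i, j] - rhs_A), abs(G[i, j] - rhs_B), abs(1 / G[i, i] - schur))
```

All three printed residuals should be at the level of floating-point rounding.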

Next, we record a simple estimate on resolvent entries of minors. For $T \subset \{1, \dots, N\}$ define the random variable

$$\Lambda^{(T)}(z) := \max_{i,j \notin T} \big| G^{(T)}_{ij}(z) - \delta_{ij} m(z) \big|.$$

Lemma 3.8 (Bound on $\Lambda^{(T)}$). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Then for any fixed $\ell \in \mathbb{N}$ we have

$$\Lambda^{(T)} \prec \Psi \tag{3.16}$$

provided that $|T| \leq \ell$. (The threshold $N_0(\varepsilon, D)$ in Definition 2.4 may also depend on $\ell$.)

Proof. See Appendix A. □

In particular, if $\Lambda \prec \Psi$ for some admissible $\Psi$, then Lemmas 3.8 and 3.5 imply that for any fixed $\ell \in \mathbb{N}$ we have

$$\frac{1}{G^{(T)}_{aa}} \prec 1 \tag{3.17}$$

provided that $|T| \leq \ell$. We conclude this section with rough bounds on the entries of $G$, which will be used to deal with exceptional, low-probability events.

Lemma 3.9 (Rough bounds on $G$). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$.
(i) We have 



for all $z \in \mathbf{S}$, $T \subset \{1, \dots, N\}$, and $i, j \notin T$.
(ii) For every $p \in \mathbb{N}$ and $\ell \in \mathbb{N}$ there is a constant $C_{p,\ell}$ such that



(3.18) 



IE 



^ a 



p.C 



for all $T \subset \{1, \dots, N\}$ satisfying $|T| \leq \ell$, all $z \in \mathbf{S}$, and all $i \notin T$.
Proof. See Appendix A.



(3.19) 



□ 
3.2. Some ingredients of the proof of Proposition 3.3. A reader interested only in our main theorem (Theorem 4.8) may skip this section and proceed to Section 4 directly. Here we sketch the proof of Proposition 3.3. Our goal is to motivate some concepts underlying our main theorem, and to give an impressionistic overview of some ideas in its proof. The actual proof of Proposition 3.3 will not be needed, since Theorem 4.8 implies Proposition 3.3 as a special case.

To avoid needless complications in our proof, we additionally assume that we are dealing with one of the two classical symmetry classes of random matrices: real symmetric and complex Hermitian. For real symmetric band matrices we assume

$$\zeta_{ij} \in \mathbb{R} \quad \text{for all } i \leq j. \tag{3.20}$$

For complex Hermitian band matrices we assume

$$\mathbb{E} \zeta_{ij}^2 = 0 \quad \text{for all } i < j. \tag{3.21}$$

A common way to satisfy (3.21) is to choose the real and imaginary parts of $\zeta_{ij}$ to be independent with identical variance. In Remark 4.13 below we explain how to remove the assumption that (3.20) or (3.21) holds, i.e. how to remove the assumption $\mathbb{E} \zeta_{ij}^2 = 0$ in the case (3.21).



The second estimate of (3.5) follows trivially from $|G_{\mu a}| \leq \Lambda \prec \Psi$. We shall sketch the proofs of the remaining inequalities in the following order:

(A) first estimate of (3.6) and second estimate of (3.7);

(B) first estimate of (3.5) and first estimate of (3.7);

(C) second estimate of (3.6).

This order corresponds to an increasing degree of complication of the proofs. These three steps thus serve as simple examples in which to introduce four basic concepts underlying our proof. More specifically, in the language of the full proof (Sections 5-9), (A) requires only the simple high-moment estimate from Section 6, (B) requires in addition the inversion of a stable self-consistent equation (Section 7.2), and (C) requires in addition a priori bounds on chains (Sections 7.2 and 7.1) as well as the procedure of vertex resolution (Section 8).

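3.2.1. Proof of (A). We focus first on the first estimate of (3.6). We derive the stochastic bound from high-moment bounds and Chebyshev's inequality. To simplify the presentation, we only estimate the variance

$$\mathbb{E}\,\Bigl|\frac{1}{N}\sum_a^{(\mu)} Q_a(G_{\mu a}G_{a\mu})\Bigr|^2 = \frac{1}{N^2}\sum_{a,b}^{(\mu)} \mathbb{E}\,Q_a(G_{\mu a}G_{a\mu})\,Q_b(G^*_{\mu b}G^*_{b\mu}). \tag{3.22}$$

Our goal is to prove that (3.22) is bounded by $C\Psi^6$. We partition the summation into the cases $a = b$ and $a \ne b$. For the case $a = b$ we easily get from Lemmas 3.6 and 3.9 the bound $CN^{-1}\Psi^4 \le C\Psi^6$, where we used (2.4) and the fact that $\Psi$ satisfies Definition 2.9.

Let us therefore focus on the case $a \ne b$. We use (3.13) to get

$$
\begin{aligned}
&\mathbb{E}\,Q_a(G_{\mu a}G_{a\mu})\,Q_b(G^*_{\mu b}G^*_{b\mu}) \\
&\quad= \mathbb{E}\,Q_a\Bigl[\Bigl(G^{(b)}_{\mu a} + \frac{G_{\mu b}G_{ba}}{G_{bb}}\Bigr)\Bigl(G^{(b)}_{a\mu} + \frac{G_{ab}G_{b\mu}}{G_{bb}}\Bigr)\Bigr]\,
Q_b\Bigl[\Bigl(G^{(a)*}_{\mu b} + \frac{G^*_{\mu a}G^*_{ab}}{G^*_{aa}}\Bigr)\Bigl(G^{(a)*}_{b\mu} + \frac{G^*_{ba}G^*_{a\mu}}{G^*_{aa}}\Bigr)\Bigr] \\
&\quad= \mathbb{E}\,Q_a\Bigl[\Bigl(G^{(b)}_{\mu a} + \frac{G^{(a)}_{\mu b}G_{ba}}{G^{(a)}_{bb}}\Bigr)\Bigl(G^{(b)}_{a\mu} + \frac{G_{ab}G^{(a)}_{b\mu}}{G^{(a)}_{bb}}\Bigr)\Bigr]\,
Q_b\Bigl[\Bigl(G^{(a)*}_{\mu b} + \frac{G^{(b)*}_{\mu a}G^*_{ab}}{G^{(b)*}_{aa}}\Bigr)\Bigl(G^{(a)*}_{b\mu} + \frac{G^*_{ba}G^{(b)*}_{a\mu}}{G^{(b)*}_{aa}}\Bigr)\Bigr] + \cdots,
\end{aligned}
\tag{3.23}
$$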



where we dropped the higher order terms of the expansion. The philosophy behind this expansion is to make each resolvent entry independent of as many indices in $(a, b)$ as possible by using (3.13) iteratively. We call such terms maximally expanded in $(a, b)$, i.e. a maximally expanded resolvent entry cannot be made independent of $a$ or $b$ using the identity (3.13); the reason is that either it already has $a$ and $b$ as upper indices or an index from $(a, b)$ appears as a lower index. See Definition 6.4 below for a precise statement. The iteration is stopped if either (3.13) cannot be applied to any resolvent entry or if a sufficient number of resolvent entries (in our case a total of six) have been generated. (In the proof of Proposition 6.3 we give a precise definition of this stopping rule.)



We now multiply everything out on the right-hand side of (3.23) to get terms of the form $\mathbb{E}\,Q_a(A)\,Q_b(B)$. The key observation is that if $B$ is independent of $a$ then the expectation vanishes (in fact, already the partial expectation $P_a$ renders the whole term zero). Similarly, if $A$ is independent of $b$ then the expectation vanishes. An example of a leading-order term from (3.23) that does not vanish is

$$\mathbb{E}\,Q_a\Bigl(\frac{G^{(a)}_{\mu b}G_{ba}}{G^{(a)}_{bb}}\,G^{(b)}_{a\mu}\Bigr)\,Q_b\Bigl(\frac{G^{(b)*}_{\mu a}G^*_{ab}}{G^{(b)*}_{aa}}\,G^{(a)*}_{b\mu}\Bigr). \tag{3.24}$$

(Note that all resolvent entries are maximally expanded in $(a, b)$.) In this fashion each $Q$ imposes the presence of at least one additional off-diagonal entry. Since every off-diagonal resolvent entry contributes a factor $\Psi$ (see Lemma 3.8), we find that (3.23) is of order $\Psi^6$ instead of the naive $\Psi^4$. This concludes the sketch of the proof of the first estimate of (3.6).

The sketch of the proof of the second estimate of (3.7) is almost identical, and therefore omitted.



3.2.2. Introduction of graphs. Before moving on to (B) and (C), we take this opportunity to introduce a graphical language which is useful for keeping track of terms such as (3.24). Although not needed here, since the example in (A) is very simple, this language will prove essential when defining more complicated expressions, as well as for the actual proof of Theorem 4.8. Recall from Figure 3.1 that we can represent the expression $G_{\mu a}G_{a\mu}$ graphically by regarding $\mu$ and $a$ as vertices, and by drawing two directed edges associated with $G_{\mu a}$ and $G_{a\mu}$. We adopt the convention given after (3.10). Thus, an off-diagonal resolvent entry $G_{ab}$, $a \ne b$, is represented with a directed solid line from $a$ to $b$, and the analogous entry $G^*_{ab}$ with a directed dashed line from $a$ to $b$.

Convention. We sometimes identify a vertex with its associated summation index, and hence use the letter $a$ to denote two different things: a vertex of a graph and the value of the associated index. This allows us to avoid a proliferation of double subscripts in expressions like $G_{a_i a_j}$. When depicting graphs, we always label a vertex using the name of the associated index.
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Figure 3.2. The graphical representations of resolvent entries. The versions associated with G* are the same with 
a dashed line. 



We shall also have to deal with diagonal resolvent entries; in fact we introduce separate notations for the three most common functions of them. Our graphical conventions are summarized in Figure 3.2. We may thus represent the expression on the left-hand side of (3.23), i.e. $Q_a(G_{\mu a}G_{a\mu})\,Q_b(G^*_{\mu b}G^*_{b\mu})$, as in Figure 3.3; note that our graphical notation does not keep track of the factors $Q$.

Figure 3.3. Graph associated with the monomial $G_{\mu a}G_{a\mu}G^*_{\mu b}G^*_{b\mu}$. Here we draw the case $a \ne b$.

Having drawn the graph in Figure 3.3, we start making all resolvent entries (corresponding to edges) independent of the indices $a$ and $b$, using the identities (3.13). As explained above, this gives rise to a sum of terms, each one of which is a fraction of resolvent entries that are maximally expanded in $(a, b)$. The denominator of each term contains diagonal resolvent entries, while its numerator is a product of off-diagonal resolvent entries; this follows from the structure of (3.13). A simple such example was given in (3.24). The associated monomial,

$$\frac{G^{(a)}_{\mu b}G_{ba}}{G^{(a)}_{bb}}\,G^{(b)}_{a\mu}\;\frac{G^{(b)*}_{\mu a}G^*_{ab}}{G^{(b)*}_{aa}}\,G^{(a)*}_{b\mu}, \tag{3.25}$$

may be represented graphically as in Figure 3.4.

Figure 3.4. Graph associated with (3.25). Here we draw the case $a \ne b$.



We remark that the graphs depicted in Figures 3.3 and 3.4 are fundamentally different in the following sense. In Figure 3.3 each edge of the graph represents a resolvent entry with no upper indices; in Figure 3.4 each edge of the graph represents a resolvent entry that is maximally expanded in $(a, b)$. In the language of Section 6, the former graph will be called $\gamma^2(\Delta)$ while the latter will be called $V$. It is the latter graphs that play a major role in our proofs. The former type is simply a trivial concatenation of basic graphs, and serves as an intermediate step in the construction of graphs of the latter type (i.e. whose edges represent maximally expanded resolvent entries). If one wanted to be more precise, one could keep track of the upper indices associated with each edge in the graphs. By definition, the edges of the graph in Figure 3.3 have no upper indices, and the edges of the graph in Figure 3.4 have upper indices as given in (3.25). However, these upper indices are unambiguously determined by the condition that each resolvent entry be maximally expanded in $(a, b)$. This means that $a$ appears as upper index of any edge that is not incident to $a$ (and similarly with $b$). In practice, therefore, we do not indicate the upper indices.

It is possible, and indeed important for our proof, to introduce a graphical rule that generates graphs like the one depicted in Figure 3.4 from graphs like the one depicted in Figure 3.3 through a sequence of graphs whose edges are not yet maximally expanded. Before the maximal expansion is achieved, we shall temporarily indicate the upper indices on the graph edges in parentheses. Recall that the underlying algebra was simply governed by the identities (3.13). Figure 3.5 depicts the identity

$$G_{ij} = G^{(k)}_{ij} + \frac{G_{ik}G_{kj}}{G_{kk}}. \tag{3.26}$$

Figure 3.5. The graphical representation of the formula (3.26).

Similarly, the corresponding identities for the diagonal entries,

$$\frac{1}{G_{ii}} = \frac{1}{G^{(j)}_{ii}} - \frac{G_{ij}G_{ji}}{G_{ii}\,G^{(j)}_{ii}\,G_{jj}}, \qquad G_{ii} = G^{(j)}_{ii} + \frac{G_{ij}G_{ji}}{G_{jj}}, \tag{3.27}$$

are depicted in Figure 3.6. Applying the graphical rules of Figures 3.5 and 3.6 to Figure 3.3 results e.g. in Figure 3.4 (and many others). To be precise, we should keep track of the upper indices associated with each edge at each step, as is done in Figure 3.6. When all edges are maximally expanded, we stop the application of the rules of Figures 3.5 and 3.6. However, as explained above, we usually omit the explicit indication of upper indices in graphs after the maximal expansion is achieved. For future use, we record the following definition associated with the operations depicted in Figures 3.5 and 3.6.

Figure 3.6. Adding an upper index $j$ to the diagonal entries $1/G_{ii}$ and $G_{ii}$. These pictures correspond to (3.27). We exceptionally also mark the upper indices associated with each edge.



Definition 3.10. We refer to the second graph on the right-hand side of Figure 3.5 as arising from linking the edge $(i, j)$ with the vertex $k$. We also say that the vertex $k$ was linked to by the edge $(i, j)$. Similarly, in both connected graphs in Figure 3.6 the vertex $j$ was linked to by the edge $(i, i)$.
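
As a sanity check on the algebra behind Figures 3.5 and 3.6, the following sketch verifies numerically, on one small matrix, the standard resolvent identities $G_{ij} = G^{(k)}_{ij} + G_{ik}G_{kj}/G_{kk}$ and $1/G_{ii} = 1/G^{(k)}_{ii} - G_{ik}G_{ki}/(G_{ii}G^{(k)}_{ii}G_{kk})$ for $i, j \ne k$. It is an illustration only: the matrix size, the seed, the spectral parameter and all function names are assumptions made for this example, and the inverse is computed by plain Gauss-Jordan elimination, which plays no role in the paper.

```python
import random


def inverse(matrix):
    """Invert a square complex matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(h, z, removed=()):
    """Entries of (H - z)^{-1} for the minor of H with the rows/columns in `removed` deleted.

    The result is a dict keyed by the ORIGINAL indices (i, j)."""
    keep = [i for i in range(len(h)) if i not in removed]
    shifted = [[h[i][j] - (z if i == j else 0) for j in keep] for i in keep]
    inv = inverse(shifted)
    return {(i, j): inv[p][q] for p, i in enumerate(keep) for q, j in enumerate(keep)}


def random_hermitian(n, seed=0):
    """A small complex Hermitian test matrix (hypothetical normalization)."""
    rng = random.Random(seed)
    h = [[0j] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = complex(rng.gauss(0, 1), 0) / n ** 0.5
        for j in range(i + 1, n):
            x = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / (2 * n) ** 0.5
            h[i][j] = x
            h[j][i] = x.conjugate()
    return h


def max_identity_errors(n=6, z=0.3 + 0.5j, k=2):
    """Largest violation of the off-diagonal and diagonal identities when adding upper index k."""
    h = random_hermitian(n)
    g = resolvent(h, z)
    gk = resolvent(h, z, removed=(k,))
    others = [i for i in range(n) if i != k]
    err_offdiag = max(abs(g[i, j] - (gk[i, j] + g[i, k] * g[k, j] / g[k, k]))
                      for i in others for j in others)
    err_diag = max(abs(1 / g[i, i]
                       - (1 / gk[i, i] - g[i, k] * g[k, i] / (g[i, i] * gk[i, i] * g[k, k])))
                   for i in others)
    return err_offdiag, err_diag
```

Calling `max_identity_errors()` returns the two maximal discrepancies, which should be at the level of floating-point rounding.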



The argument underlying (3.23) may now be formulated graphically as follows. We start from Figure 3.3 and apply the identities from Figures 3.5 and 3.6 until all resolvent entries associated with the edges are maximally expanded in $(a, b)$. Since these identities can be applied in various orders, this procedure is not unique. This lack of uniqueness does not concern us, however: we need only a maximally expanded representation. By the argument given after (3.23), we know that only those graphs in which both $a$ and $b$ have been linked to by an edge survive. Such graphs (as the one from Figure 3.4) have (at least) two additional edges as compared to the one from Figure 3.3. This results in a size $O_\prec(\Psi^6)$.



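3.2.3. Sketch of the proof of (B). We focus first on the first estimate of (3.5). The idea is to derive a stable self-consistent equation for the quantity

$$\frac{1}{N}\sum_a^{(\mu)} G_{\mu a}G_{a\mu}. \tag{3.28}$$

We do this by introducing the partition $1 = P_a + Q_a$ inside the summation. The second resulting term was estimated in (A). The first resulting term may be written as

$$
\begin{aligned}
\frac{1}{N}\sum_a^{(\mu)} P_a(G_{\mu a}G_{a\mu})
&= \frac{m^2}{N}\sum_a^{(\mu)} P_a\Bigl(\frac{G_{\mu a}G_{a\mu}}{G_{aa}^2}\Bigr) + O_\prec(\Psi^3) \\
&= \frac{m^2}{N}\sum_a^{(\mu)}\sum_{x,y}^{(a)} P_a\bigl(G^{(a)}_{\mu x}h_{xa}h_{ay}G^{(a)}_{y\mu}\bigr) + O_\prec(\Psi^3) \\
&= \frac{m^2}{N}\sum_a^{(\mu)}\sum_{x}^{(a)} s_{ax}\,G^{(a)}_{\mu x}G^{(a)}_{x\mu} + O_\prec(\Psi^3) \\
&= \frac{m^2}{N}\sum_a^{(\mu)}\sum_{x}^{(a)} s_{ax}\,G_{\mu x}G_{x\mu} + O_\prec(\Psi^3) \\
&= \frac{m^2}{N}\sum_x^{(\mu)} G_{\mu x}G_{x\mu} + O_\prec(\Psi^3 + N^{-1}).
\end{aligned}
$$

In the first step we used (3.17). In the second step we used the identity (3.14a) (note the usefulness of smuggling in $G_{aa}$ in the previous step). In the third step we used the identity $P_a h_{xa}h_{ay} = \mathbb{E}\,h_{xa}h_{ay} = s_{ax}\delta_{xy}$, as follows from the definition of $H$ and the fact that $G^{(a)}$ is independent of $a$. In the fourth step we used the identity (3.13) to remove the upper indices. In the fifth step we used a simple analysis of coinciding indices together with the estimates (2.2) and (3.11). Together with the bound from (A), we therefore get for the quantity (3.28) the self-consistent equation

$$\frac{1}{N}\sum_a^{(\mu)} G_{\mu a}G_{a\mu} = \frac{1}{N}\sum_a^{(\mu)} P_a(G_{\mu a}G_{a\mu}) + O_\prec(\Psi^3) = \frac{m^2}{N}\sum_x^{(\mu)} G_{\mu x}G_{x\mu} + O_\prec(\Psi^3 + M^{-1/2}\Psi),$$

where in the last step we used (2.9). Using (3.4) and the trivial bound $G_{\mu a}G_{a\mu} \prec \Psi^2$ we therefore get

$$\frac{1}{N}\sum_a^{(\mu)} G_{\mu a}G_{a\mu} \prec \min\Bigl\{\Psi^2,\ \frac{\Psi^3 + M^{-1/2}\Psi}{|1 - m^2|}\Bigr\} \le \min\bigl\{\Psi^2,\ \varrho\,\Psi^2(\Psi + M^{-1/2}\Psi^{-1})\bigr\} = \Psi^2\Phi,$$

which is the first estimate of (3.5).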

The proof of the first estimate of (3.7) is similar, except that we derive the self-consistent equation using (3.14c) instead of (3.14a). Using the second estimate of (3.7) we find

$$\frac{1}{N}\sum_a (G_{aa} - m) = \frac{1}{N}\sum_a P_a(G_{aa} - m) + \frac{1}{N}\sum_a Q_a G_{aa} = \frac{1}{N}\sum_a P_a(G_{aa} - m) + O_\prec(\Psi^2). \tag{3.29}$$

Next, from a simple large deviation estimate (see the first paragraph in the proof of Lemma 9.1 in Appendix A) we find $Z_a \prec \Psi$. Moreover, Lemma 3.8, (2.2), and (2.9) readily imply that $U_a^{(a)} \prec \Psi$. Recalling the estimate (2.10), we may therefore expand the identity (3.14c) using (2.7) to get

$$G_{aa} = m + m^2\bigl(-h_{aa} + Z_a + U_a^{(a)}\bigr) + O_\prec(\Psi^2).$$

Using $P_a h_{aa} = 0$ and $P_a Z_a = 0$ we therefore find

$$\frac{1}{N}\sum_a P_a(G_{aa} - m) = \frac{m^2}{N}\sum_a U_a^{(a)} + O_\prec(\Psi^2) = \frac{m^2}{N}\sum_a\sum_x^{(a)} s_{ax}\bigl(G^{(a)}_{xx} - m\bigr) + O_\prec(\Psi^2) = \frac{m^2}{N}\sum_x (G_{xx} - m) + O_\prec(\Psi^2),$$

where in the second step we recalled the definition (3.15) and used (2.2) as well as (2.3) to write $m = \sum_x^{(a)} s_{ax}\,m + O(M^{-1})$ with $M^{-1} = O(\Psi^2)$ by (2.9), and in the third (3.13) to get rid of the upper index $a$ as well as (2.3). Thus, together with (3.29), we get the self-consistent equation

$$\frac{1}{N}\sum_a (G_{aa} - m) = \frac{m^2}{N}\sum_a (G_{aa} - m) + O_\prec(\Psi^2), \tag{3.30}$$

from which we easily conclude the first estimate of (3.7) as before.

In both of the above examples the averaging was performed with respect to the uniform weight $w_a = N^{-1}$. We conclude by sketching the differences in the case of a nontrivial weight, e.g. $w_a = s_{\nu a}$. Consider for example the average $\sum_a s_{\nu a}(G_{aa} - m)$ from (3.8). Repeating the above derivation of (3.30), we find the self-consistent system of equations

$$\sum_a s_{\nu a}(G_{aa} - m) = m^2\sum_a s_{\nu a}\sum_x s_{ax}(G_{xx} - m) + \mathcal{E}_\nu$$

for each $\nu$. Here the error satisfies $\mathcal{E}_\nu = O_\prec(\Psi^2)$. Introducing the vectors $v = (v_a)_{a=1}^N$ defined by $v_a := \sum_x s_{ax}(G_{xx} - m)$ and $\mathcal{E} = (\mathcal{E}_\nu)_{\nu=1}^N$, we have

$$v = m^2 S v + \mathcal{E}.$$

Thus we find

$$v = (1 - m^2 S)^{-1}\mathcal{E},$$

from which we conclude that $v_a \prec \varrho\,\Psi^2 \le \Psi\Phi$.
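
The inversion step $v = (1 - m^2S)^{-1}\mathcal{E}$ can be illustrated numerically. The sketch below solves $v = m^2 S v + \mathcal{E}$ by fixed-point iteration for a hypothetical cyclic band profile $S$ (rows summing to one) and a hypothetical value of $m$ with $|m| < 1$; in that regime the Neumann series converges and gives $\|v\|_\infty \le \|\mathcal{E}\|_\infty / (1 - |m|^2)$. The profile, the value of $m$, the error vector and the function names are all assumptions for illustration. The delicate regime $|m| \to 1$, where the size of $\varrho$ matters, is not captured by this toy bound.

```python
def band_profile(n, width):
    """Cyclic band variances: s_ij = 1/(2*width+1) if |i-j| <= width (mod n), else 0.

    Requires 2*width+1 <= n so that every row sums to one."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for shift in range(-width, width + 1):
            s[i][(i + shift) % n] = 1.0 / (2 * width + 1)
    return s


def apply_map(s, m, err, v):
    """One application of v -> m^2 S v + err."""
    n = len(s)
    return [m * m * sum(s[a][x] * v[x] for x in range(n)) + err[a] for a in range(n)]


def solve_self_consistent(s, m, err, iterations=300):
    """Fixed-point iteration; a contraction in the sup norm whenever |m| < 1."""
    v = [0j] * len(s)
    for _ in range(iterations):
        v = apply_map(s, m, err, v)
    return v


def demo():
    n, width = 12, 2
    s = band_profile(n, width)
    m = complex(-0.2, 0.7)  # hypothetical value with |m|^2 = 0.53 < 1
    err = [complex(0.01 * (a % 3), -0.02 * (a % 2)) for a in range(n)]
    v = solve_self_consistent(s, m, err)
    residual = max(abs(x - y) for x, y in zip(v, apply_map(s, m, err, v)))
    sup_v = max(abs(x) for x in v)
    neumann_bound = max(abs(x) for x in err) / (1 - abs(m) ** 2)
    return residual, sup_v, neumann_bound
```

`demo()` returns the residual of the equation at the computed solution together with $\|v\|_\infty$ and the Neumann-series bound, so the two can be compared directly.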

3.2.4. Sketch of the proof of (C). As in (A), the proof is based on a high-moment estimate. We again restrict our attention to the variance

$$\mathbb{E}\,\Bigl|\frac{1}{N}\sum_a^{(\mu)} Q_a(G_{\mu a}G^*_{a\mu})\Bigr|^2 = \frac{1}{N^2}\sum_{a,b}^{(\mu)} \mathbb{E}\,Q_a(G_{\mu a}G^*_{a\mu})\,Q_b(G_{\mu b}G^*_{b\mu}). \tag{3.31}$$

Our goal is to derive the stochastic bound $\Psi^6\Phi^2$ for (3.31). The case $a = b$ yields the bound

$$\frac{1}{N^2}\sum_a^{(\mu)} \mathbb{E}\,\bigl|Q_a(G_{\mu a}G^*_{a\mu})\bigr|^2 \prec N^{-1}\Psi^4. \tag{3.32}$$

Let us therefore assume for the following that $a \ne b$. The first part of the argument follows precisely the proof of (A) above. We expand all resolvent entries of

$$\mathbb{E}\,Q_a(G_{\mu a}G^*_{a\mu})\,Q_b(G_{\mu b}G^*_{b\mu}) \tag{3.33}$$

using (3.13) and obtain a sum of monomials whose resolvent entries are maximally expanded in $(a, b)$. A typical example of a nonvanishing term arising from the expansion of (3.31) is

$$\mathbb{E}\,Q_a\Bigl(\frac{G^{(a)}_{\mu b}G_{ba}}{G^{(a)}_{bb}}\,G^{(b)*}_{a\mu}\Bigr)\,Q_b\Bigl(G^{(a)}_{\mu b}\,\frac{G^*_{ba}G^{(b)*}_{a\mu}}{G^{(b)*}_{aa}}\Bigr).$$

As in the proof of (A), this immediately gives the stochastic bound $\Psi^6$. See Figure 3.7 for a graphical summary of the argument in this context.

Figure 3.7. The process of making all edges of the graph associated with (3.33) maximally expanded in $(a, b)$.


The bound $\Psi^6$ is not enough, however. In order to improve this to $\Psi^6\Phi^2$, we introduce a new operation which we call vertex resolution. In order to simplify the presentation, in the following we systematically replace any diagonal entry $G^{(T)}_{aa}$ by $m$. The resulting error terms are small by definition of $\Lambda$ (of course, they have to be dealt with, which is done in Section 9.1 of the full proof below). Thus, we have to estimate the expression

$$\mathbb{E}\,Q_a\bigl(G^{(a)}_{\mu b}G_{ba}G^{(b)*}_{a\mu}\bigr)\,Q_b\bigl(G^{(a)}_{\mu b}G^*_{ba}G^{(b)*}_{a\mu}\bigr) \tag{3.34}$$

for $a \ne b$. We begin by expanding all resolvent entries using the Family B identity (3.14a), again neglecting the diagonal prefactors in (3.14a). This gives

$$\sum_{x,y,z,w}^{(ab)}\ \sum_{x',y',z',w'}^{(ab)} \mathbb{E}\,Q_a\bigl(G^{(ab)}_{\mu x}h_{xb}h_{by}G^{(ab)}_{yz}h_{za}h_{aw}G^{(ab)*}_{w\mu}\bigr)\,Q_b\bigl(G^{(ab)}_{\mu x'}h_{x'b}h_{by'}G^{(ab)*}_{y'z'}h_{z'a}h_{aw'}G^{(ab)*}_{w'\mu}\bigr). \tag{3.35}$$

(Here we also ignored a few special cases of coinciding indices when expanding both $a$ and $b$ in $G_{ab}$ using (3.14a). As usual, the resulting terms are subleading and unimportant for this sketchy discussion.) The idea behind (3.35) is to expand all of the randomness that depends on $a$ and $b$ explicitly (i.e. in entries of $H$), so that partial expectations may be explicitly taken. Note that all resolvent entries in (3.35) are independent of $a$ and $b$. We may now take the expectation $\mathbb{E}$ in (3.35); more precisely, we reorganize (3.35) as

$$\sum_{x,y,z,w}^{(ab)}\ \sum_{x',y',z',w'}^{(ab)} \mathbb{E}\bigl(G^{(ab)}_{\mu x}G^{(ab)}_{yz}G^{(ab)*}_{w\mu}G^{(ab)}_{\mu x'}G^{(ab)*}_{y'z'}G^{(ab)*}_{w'\mu}\bigr)\,P_a\bigl[Q_a(h_{za}h_{aw})\,h_{z'a}h_{aw'}\bigr]\,P_b\bigl[h_{xb}h_{by}\,Q_b(h_{x'b}h_{by'})\bigr]. \tag{3.36}$$

The two square brackets in (3.36) may be computed explicitly. Since $\mathbb{E}\,h_{uv} = 0$, each matrix entry $h_{uv}$ must (at least) be paired with another copy of the same factor or its conjugate $\bar{h}_{uv} = h_{vu}$. Assume first that we are dealing with a complex Hermitian band matrix (condition (3.21)). In that case, each $h_{uv}$ must be paired with its conjugate $h_{vu}$ since $\mathbb{E}\,h_{uv}^2 = 0$. Of course, it may happen that more than two entries have coinciding indices, but this leads to a term that is subleading by a factor $M^{-1/2}$, and which we neglect here. Thus, $h_{za}$ in (3.36) may be paired with $h_{aw}$ (resulting in $z = w$) or with $h_{aw'}$ (resulting in $z = w'$). However, the pairing $h_{za}$ with $h_{aw}$ gives a vanishing contribution owing to the presence of $Q_a$, since

$$\mathbb{E}_{za}\,Q_a(h_{za}h_{az}) = 0,$$

where $\mathbb{E}_{za}$ denotes partial expectation with respect to $h_{za}$. In other words, $Q_a$ forbids the pairing of $h_{za}$ with $h_{aw}$, and similarly $Q_b$ the pairing of $h_{x'b}$ with $h_{by'}$. Thus the leading order term resulting from the square brackets in (3.36), on which we focus here, is the pairing

$$s_{az}s_{aw}s_{bx}s_{by}\,\delta_{wz'}\delta_{zw'}\delta_{x'y}\delta_{xy'}. \tag{3.37}$$



In the real symmetric case (condition (3.20)), where $\mathbb{E}\,h_{uv}^2$ does not vanish, $h_{za}$ can also be paired with $h_{z'a}$. (Note that $Q_a$ still forbids the pairing of $h_{za}$ with $h_{aw}$.) This yields the three further allowed pairings

$$s_{az}s_{aw}s_{bx}s_{by}\,\delta_{zz'}\delta_{ww'}\delta_{x'y}\delta_{xy'}, \qquad s_{az}s_{aw}s_{bx}s_{by}\,\delta_{wz'}\delta_{zw'}\delta_{xx'}\delta_{yy'}, \qquad s_{az}s_{aw}s_{bx}s_{by}\,\delta_{zz'}\delta_{ww'}\delta_{xx'}\delta_{yy'}. \tag{3.38}$$

Assuming again condition (3.21), only (3.37) contributes, and we get the expression (up to lower order error terms in $M^{-1/2}$)

$$
\begin{aligned}
&\sum_{x,y,z,w}^{(ab)} s_{az}s_{aw}s_{bx}s_{by}\,\mathbb{E}\,G^{(ab)}_{\mu x}G^{(ab)}_{yz}G^{(ab)*}_{w\mu}G^{(ab)}_{\mu y}G^{(ab)*}_{xw}G^{(ab)*}_{z\mu} \\
&\quad= \mathbb{E}\Bigl(\sum_{x,w}^{(ab)} s_{bx}s_{aw}\,G^{(ab)}_{\mu x}G^{(ab)*}_{xw}G^{(ab)*}_{w\mu}\Bigr)\Bigl(\sum_{y,z}^{(ab)} s_{by}s_{az}\,G^{(ab)}_{\mu y}G^{(ab)}_{yz}G^{(ab)*}_{z\mu}\Bigr).
\end{aligned}
\tag{3.39}
$$

Now each of the expressions in the parentheses is stochastically bounded by $\Psi^3\Phi$. Indeed, an argument very similar to the proof of (B) above yields

$$\sum_y^{(ab)} s_{by}\,G^{(ab)}_{\mu y}G^{(ab)}_{yz} \prec \Psi^2\Phi. \tag{3.40}$$

(The additional upper indices $(ab)$ are unimportant.) Thus, from the summation over $y$ in (3.39) we gain an additional factor $\Phi$ (and, similarly, from the summation over $w$). We therefore find that (3.33) is stochastically bounded by $\Psi^6\Phi^2$, which was the claim of (C).
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Figure 3.8. The graphical notation for entries of G, G* , H, and S. Since S is symmetric, the edge associated with 
$s_{ab}$ is undirected.



The use of graphs greatly clarifies the mechanism underlying the above sketch of the proof of (C). In order to depict vertex resolution, we need a graphical notation for edges associated with matrix entries $h_{uv}$ and $s_{ab}$; we represent the former using dotted lines and the latter using wiggly lines. See Figure 3.8. As seen above, the starting point for the operation of vertex resolution is (3.34). The expanded expression (3.35) may be graphically represented as in Figure 3.9. Thus, the vertex $a$ is "resolved" into four (i.e. the degree of $a$) new vertices, which are drawn in white and are connected to their parent vertex $a$ by dotted lines (corresponding to a matrix entry of $H$). White vertices are either incoming or outgoing, depending on the orientation of the dotted edge that joins them to their parent vertex $a$. Similarly, the vertex $b$ is resolved into four new vertices. Note that each solid or dashed edge in the right-hand graph of Figure 3.9 represents a resolvent matrix entry that is independent of $a$ and $b$. The expression (3.39) was obtained from (3.35) by computing the partial expectations $P_a$ and $P_b$ of the associated entries of $H$. Graphically, this amounts to a pairing of the white vertices surrounding each black parent vertex. (Note that the factors $Q$, which yielded constraints on the allowed pairings, are not visible in the graphs. This is not a problem, however, as the ensuing bounds will hold for all pairings, even if these restrictions are relaxed.) The pairing of two dotted lines gives rise to a wiggly line, in accordance with the identity $\mathbb{E}\,|h_{ax}|^2 = s_{ax}$. See Figure 3.10 for a graphical representation of the pairing in (3.39). In Figure 3.10 we represented the pairing (3.37), which is the only one in the complex Hermitian case (3.21). In this case, the orientation of the edges must be matched when pairing white vertices, i.e. an incoming white vertex can only be paired with an outgoing one. This is an immediate consequence of the condition $\mathbb{E}\,h_{ax}^2 = 0$. In the real symmetric case (3.20), where $\mathbb{E}\,h_{ax}^2 = \mathbb{E}\,|h_{ax}|^2 = s_{ax}$, the other pairings (3.38) are also possible. Graphically, this means that, when pairing white vertices, there are no constraints on the orientation of the incident edges. In other words, the arrows on the dotted edges may be ignored.
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Figure 3.10. Taking a pairing of the white vertices to get the completed resolution. 



The result of the vertex resolution is graphically evident when comparing the first graph in Figure 3.9 and the second graph of Figure 3.10: the vertex $a$, of degree four, has been split (or "resolved") into two vertices of degree two. (The same happened for $b$.) This resolution also entails the creation of new summation indices, $z$ and $w$. Each one of them is connected to the original vertex $a$ by a factor $s_{az}$ (respectively $s_{aw}$), which implies that the summation over the larger family of summation indices is still performed with respect to a normalized weight. Generally, vertex resolution splits vertices of high degree into several vertices of degree two. The reason why this helps is that we can gain an extra factor $\Phi$ from any summation vertex of degree two whose incident edges are of the same "colour" (solid or dashed). We shall use the name marked vertex (see the definition below) to denote a vertex whose resolution yields at least one new summation vertex whose (two) incident edges are of the same colour. The mechanism behind the gain of a factor $\Phi$ from a newly created (via resolution) index is roughly the content of (B), and was used in (3.40). In our case, we gain from the summations over $y$ and $w$ (but not $z$ or $x$). Generally, the process of vertex resolution yields long "chains" (i.e. subgraphs whose vertices have degree two), each vertex of which yields an extra factor $\Phi$ provided both incident edges have the same colour. In fact, establishing such estimates for chains is an important step in our proof (see Proposition 5.3 below). This concludes our overview of the proof of Proposition 3.3.



4. General monomials and main result 



In this section we state the fluctuation averaging theorem in full generality. To that end, we introduce a general class of monomials which we shall average. We consider monomials in the variables

$$\mathcal{G}_{ij}(z) := G_{ij}(z) - \delta_{ij}\,m(z),$$

which yield a more consistent power counting for diagonal resolvent entries. Indeed, by definition $|\mathcal{G}_{ij}| \le \Lambda$ for all $i$ and $j$. As we saw in Section 3, monomials in the resolvent entries are best described using graphs; see (3.10) and Figure 3.1.

We may now define the graphs $\Delta$ used to describe monomials.

Definition 4.1 (Admissible graphs). (i) Let $V_s$ and $V_e$ be finite disjoint sets. Let $V := V_s \sqcup V_e$ be their disjoint union,² and $E$ be a subset of the ordered pairs $V \times V$. The quadruple

$$\Delta = (V_s, V_e, E, \xi)$$

is an admissible graph if it is a directed, edge-coloured multigraph with set of vertices $V$. The edges are ordered pairs of vertices with multiplicity, i.e. we allow loops and multiple edges. We shall also use the notation $V_s = V_s(\Delta)$, $V_e = V_e(\Delta)$, and $E = E(\Delta)$.

More formally, we can view $E(\Delta)$ as an arbitrary finite set equipped with maps $\alpha, \beta : E(\Delta) \to V(\Delta)$. Here $\alpha(e)$ and $\beta(e)$ represent the source and target vertices of the edge $e \in E(\Delta)$. The colouring $\xi : E(\Delta) \to \{1, *\}$ is a mapping that assigns one of two "colours", $1$ or $*$, to each edge. If no confusion is possible with the multiplicity of an edge $e \in E(\Delta)$, we shall identify it with the ordered pair $(\alpha(e), \beta(e))$.

(ii) We denote by $\mathfrak{G}$ the set of admissible graphs $\Delta$ on arbitrary $V_s$ and $V_e$.

(iii) The degree of $\Delta \in \mathfrak{G}$ is

$$\deg(\Delta) := |E(\Delta)|.$$

The set $V_s(\Delta)$ will label the family of summation indices ($(a, b) = (a_i)_{i \in V_s(\Delta)}$ in the example (3.9)), and $V_e(\Delta)$ the set of external indices ($(\mu, \rho) = (\mu_i)_{i \in V_e(\Delta)}$ in the example (3.9)). We use the notation

$$\mathbf{u} = (\mathbf{a}, \boldsymbol{\mu}), \qquad \mathbf{a} = (a_i)_{i \in V_s(\Delta)}, \qquad \boldsymbol{\mu} = (\mu_i)_{i \in V_e(\Delta)} \tag{4.1}$$

for the matrix indices. Generally, we try to use Latin letters $a, b, c, d, x, y, z, \dots$ for summation indices and Greek letters $\mu, \nu, \dots$ for external indices.

Although our statements and proofs hold for any admissible graph $\Delta$, in order to avoid trivial cases in our applications we shall always consider graphs without isolated vertices and with the property that each edge is incident to at least one vertex in $V_s(\Delta)$, i.e. every resolvent entry contains at least one summation index.

Next, we introduce the monomials in $(\mathcal{G}_{xy})$ whose average we shall estimate.

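Definition 4.2 (Monomials). Let $\Delta \in \mathfrak{G}$ be an admissible graph and let $\boldsymbol{\mu} \in \{1, \dots, N\}^{V_e(\Delta)}$ be a collection of external indices. We define the monomial

$$Z_{\mathbf{a}} \equiv Z^{\boldsymbol{\mu}}_{\mathbf{a}}(\Delta) := \prod_{e \in E(\Delta)} \mathcal{G}^{\xi(e)}_{u_{\alpha(e)}u_{\beta(e)}}, \tag{4.2}$$

which is regarded as a function of the summation indices $\mathbf{a}$, recalling the splitting of the indices (4.1). We also denote by

$$Z = \bigl(Z^{\boldsymbol{\mu}}_{\mathbf{a}}(\Delta) : \mathbf{a} \in \{1, \dots, N\}^{V_s(\Delta)}\bigr)$$

the family of monomials associated with $(\Delta, \boldsymbol{\mu})$ by (4.2), and say that $\Delta$ encodes the monomial $Z_{\mathbf{a}}$.

Note that $\deg(\Delta)$ is the degree of the monomial $Z_{\mathbf{a}}$ encoded by $\Delta$. Throughout the following we shall frequently drop the explicit dependence of $Z_{\mathbf{a}}$ on $\boldsymbol{\mu}$ and $\Delta$.

The averaging over $\mathbf{a}$ will be performed with respect to a weight $w(\mathbf{a})$. In the example (3.9), this weight was $w(a, b) = s_{\mu a}s_{\rho b}$. A typical example of a weight is

$$w(a, b, c) = \frac{1}{N}\sum_d s_{\mu d}s_{db}s_{bc} \quad \text{for } \mathbf{a} = (a, b, c). \tag{4.3}$$

² Here, and throughout the following, we use the symbol $\sqcup$ to denote disjoint union.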
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In order to define a general class of weights, the following notion of partitioning of summation indices is 
helpful. 

Definition 4.3 (Partition of indices). Let $I$ be a finite index set. For $\mathbf{a} = (a_i)_{i \in I} \in \{1, \dots, N\}^I$ we denote by $\mathcal{P}(\mathbf{a})$ the partition of $I$ defined by the equivalence relation $k \sim l$ if and only if $a_k = a_l$.

Generally, we consider weights satisfying the following definition; when reading it, it is good to keep examples of the type (4.3) in mind.

Definition 4.4 (Weights). A map $w : \{1, \dots, N\}^{V_s(\Delta)} \to [0, 1]$ is a weight adapted to $\Delta \in \mathfrak{G}$ if it satisfies the following condition. Let $V_s(\Delta) = I \sqcup J$ be a (possibly trivial) partition of $V_s(\Delta)$ into two disjoint subsets, inducing a splitting $\mathbf{a} = (\mathbf{a}_I, \mathbf{a}_J)$ of the summation indices. Then we require that, for any partition $P$ of $J$, we have

$$\max_{\mathbf{a}_I}\sum_{\mathbf{a}_J} \mathbf{1}\bigl(\mathcal{P}(\mathbf{a}_J) = P\bigr)\,w(\mathbf{a}_I, \mathbf{a}_J) \le M^{|P| - |V_s(\Delta)|}, \tag{4.4}$$

where $|P|$ denotes the number of blocks in $P$.



The interpretation of (4.4) is that the left-hand side of (4.4) has $|P|$ free summation indices; the remaining summation indices have been either frozen (i.e. they belong to $\mathbf{a}_I$) or merged with others (i.e. they belong to a nontrivial block of $P$). Then (4.4) simply states that each suppressed summation yields a factor $M^{-1}$. In particular, with $J = V_s(\Delta)$ and the trivial atomic partition $P$ we have

$$\sum_{\mathbf{a}} w(\mathbf{a}) \le 1,$$

i.e. the total sum of all weights is always bounded by one.
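
To make condition (4.4) concrete, the following sketch checks it by brute force for the product weight $w(a, b) = s_{\mu a}s_{\rho b}$ on a small example. It is a toy verification under stated assumptions: the variance profile is a hypothetical cyclic band with $s_{ij} = 1/(2W + 1)$, the parameter $M$ is taken to be $1/\max_{ij} s_{ij} = 2W + 1$, and the function names and index values are invented for the example. The helper `index_partition` computes the partition of Definition 4.3 for a tuple of index values.

```python
from itertools import product


def band_profile(n, width):
    """Cyclic band variances: s_ij = 1/(2*width+1) if |i-j| <= width (mod n), else 0."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for shift in range(-width, width + 1):
            s[i][(i + shift) % n] = 1.0 / (2 * width + 1)
    return s


def index_partition(values):
    """Partition of the positions 0..len(values)-1: k ~ l iff values[k] == values[l]."""
    blocks = {}
    for position, value in enumerate(values):
        blocks.setdefault(value, set()).add(position)
    return frozenset(frozenset(block) for block in blocks.values())


def check_product_weight(n=8, width=1, mu=0, rho=3):
    """Check every case of (4.4) for w(a, b) = s[mu][a] * s[rho][b] with |V_s| = 2."""
    s = band_profile(n, width)
    big_m = 2 * width + 1  # assumption: M = 1 / max_ij s_ij
    tol = 1e-12
    idx = range(n)

    def w(a, b):
        return s[mu][a] * s[rho][b]

    checks = {}
    # I = {a, b}, J empty: nothing is summed, so two summations are suppressed.
    checks["both frozen"] = max(w(a, b) for a, b in product(idx, idx)) <= big_m ** -2 + tol
    # One index frozen, the other summed freely: one summation is suppressed.
    checks["a summed"] = max(sum(w(a, b) for a in idx) for b in idx) <= big_m ** -1 + tol
    checks["b summed"] = max(sum(w(a, b) for b in idx) for a in idx) <= big_m ** -1 + tol
    # I empty, J = {a, b}: split the double sum according to the partition of (a, b).
    sums = {}
    for a, b in product(idx, idx):
        key = index_partition((a, b))
        sums[key] = sums.get(key, 0.0) + w(a, b)
    for partition, total in sums.items():
        checks["blocks=%d" % len(partition)] = total <= big_m ** (len(partition) - 2) + tol
    return checks
```

`check_product_weight()` returns a dictionary with one boolean per case of (4.4); the entry `blocks=2` is the atomic partition, for which the bound is $1$.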



When estimating averages such as (3.9), we shall always impose that all indices that have distinct names also have distinct values. In the case that two indices have the same value, we give them the same name. Thus, for example we write

$$\frac{1}{N^2}\sum_{a,b} G_{\mu a}G_{ab}G_{b\mu} = \frac{1}{N^2}\sum_{a,b}^{(\mu)*} G_{\mu a}G_{ab}G_{b\mu} + \frac{1}{N^2}\sum_a^{(\mu)} G_{\mu a}G_{aa}G_{a\mu} + \frac{1}{N^2}\sum_a^{(\mu)} G_{\mu a}G_{a\mu}G_{\mu\mu} + \frac{1}{N^2}\sum_b^{(\mu)} G_{\mu\mu}G_{\mu b}G_{b\mu} + \frac{1}{N^2}\,G_{\mu\mu}^3,$$

where a star on top of a summation means that all summation indices are constrained to be distinct. (Recall also the notation $\sum^{(S)}$ for $S \subset \{1, \dots, N\}$ from Definition 2.2.)

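We may now define our central quantity. Let $\Delta \in \mathfrak{G}$ and $\boldsymbol{\mu} \in \{1, \dots, N\}^{V_e(\Delta)}$ be a collection of external indices. Let $F \subset V_s(\Delta)$ and $w$ be a weight adapted to $\Delta$. We define

$$X^w_F(\Delta) \equiv X^{w,\boldsymbol{\mu}}_F(\Delta) := \sum_{\mathbf{a}}^{(\boldsymbol{\mu})*} w(\mathbf{a})\Bigl(\prod_{i \in F} Q_{a_i}\Bigr) Z^{\boldsymbol{\mu}}_{\mathbf{a}}(\Delta). \tag{4.5}$$

Thus, $F$ denotes the set of summation indices that come with an operator $Q$. As explained above, the symbol $(\boldsymbol{\mu})$ on top of the sum means that $a_i \ne \mu_j$ for all $i \in V_s(\Delta)$ and $j \in V_e(\Delta)$, and the star means that $a_i \ne a_j$ for all distinct $i, j \in V_s(\Delta)$. Throughout the following, we shall frequently drop the explicit dependence of $X^{w,\boldsymbol{\mu}}_F(\Delta)$ on $\boldsymbol{\mu}$.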
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Remark 4.5. In (4.5) each operator Q acts on all resolvent entries in Z a . We make this choice to simplify 
the presentation; also, this is sufficient for all of our current applications. However, our results may be easily 
extended to more complicated quantities, in which each Q acts only on a subset of the resolvent entries in 
$Z_{\mathbf{a}}$. Thus, in general, a resolvent entry is either outside or inside $Q_{a_i}$, for each $i \in F$. We require that each resolvent entry outside $Q_{a_i}$ have no index $a_i$, and at least one resolvent entry inside $Q_{a_i}$ have an index $a_i$. Then our proof carries over with merely cosmetic changes. For example, expressions such as



^2 S HaSpbSbcQ a {^G \iaQb(G a bG bu)Q c{G ac G c d)G ' dfi 



may be estimated in this fashion. 

From Lemmas 3.6 and 3.9 we find the trivial bound

$$X^w_F(\Delta) \prec \Psi^{\deg(\Delta)} \tag{4.6}$$

for any adapted weight $w$, provided that $\Lambda \prec \Psi$. We call (4.6) trivial because we also have the bound

$$Z^{\boldsymbol{\mu}}_{\mathbf{a}}(\Delta) \prec \Psi^{\deg(\Delta)}.$$

Hence the estimate (4.6) has not been improved by the averaging over $\mathbf{a}$.

Next, we define indices which count the gain in the size of $X^w_F(\Delta)$ resulting from the averaging over $\mathbf{a}$ and from the factors $Q$.


Definition 4.6. Let $\Delta$ be an edge-coloured graph as in Definition 4.1. For $i \in V(\Delta)$ we set

$$\nu_i(\Delta) := \sum_{(j,k) \in E(\Delta)} \mathbf{1}\bigl(\xi((j,k)) = 1\bigr)\bigl[\mathbf{1}(i = j) + \mathbf{1}(i = k)\bigr], \qquad \nu_i^*(\Delta) := \sum_{(j,k) \in E(\Delta)} \mathbf{1}\bigl(\xi((j,k)) = *\bigr)\bigl[\mathbf{1}(i = j) + \mathbf{1}(i = k)\bigr].$$

Informally, $\nu_i(\Delta)$ is the number of legs of colour $1$ incident to $i$, and $\nu_i^*(\Delta)$ the number of legs of colour $*$ incident to $i$.

We shall use $\deg(i) \equiv \deg_\Delta(i)$ to denote the degree of the vertex $i \in V(\Delta)$. It is sometimes important to emphasize that this degree is computed with respect to the graph $\Delta$, which we indicate using the subscript³ $\Delta$. By definition, $\deg_\Delta(i)$ is the number of legs incident to $i$, i.e. a loop at $i$ counts twice. In particular, $\deg_\Delta(i) = \nu_i(\Delta) + \nu_i^*(\Delta)$.

In terms of the monomials $Z$ encoded by $\Delta$, the index $\nu_i(\Delta)$ (respectively $\nu_i^*(\Delta)$) is the number of resolvent entries of $\mathcal{G}$ (respectively of $\mathcal{G}^*$) in which the index $a_i$ appears. (Note that if the index appears twice in a resolvent entry, this entry is counted twice.)

Definition 4.7 (Charged vertex). We call a summation vertex $i \in V_s(\Delta)$ charged if either

(i) $i \notin F$ and $\nu_i \ne \nu_i^*$, or

(ii) $i \in F$ and $|\nu_i - \nu_i^*| \ne 2$.

We denote by $V_c(\Delta) \subset V_s(\Delta)$ the set of charged vertices.

³ Of course, $\deg_\Delta$ is not the same as $\deg(\Delta)$. In fact, we have $\deg(\Delta) = \frac{1}{2}\sum_{i \in V(\Delta)} \deg_\Delta(i)$.

We may now state our main result.

Theorem 4.8 (Averaging theorem). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Let $\Delta \in \mathfrak{G}$ (recall Definitions 4.1 and 4.7) and $F \subset V_s(\Delta)$. Then

$$X^{w,\boldsymbol{\mu}}_F(\Delta) \prec \Psi^{\deg(\Delta) + |F|}\,\Phi^{|V_c(\Delta)|} \tag{4.7}$$

for any $\boldsymbol{\mu}$ and weight $w$ adapted to $\Delta$ (recall Definition 4.4).

Thus, Theorem 4.8 states that we gain a factor $\Psi$ from each $Q$ and a factor $\Phi$ from each charged vertex. The rationale behind the name "charged" is that, in the vertex resolution process from the proof of Theorem 4.8, a charged vertex gives rise, in leading order, to a collection of vertices of degree two, at least one of which will be a chain vertex (see Definition 5.1) and hence yield a factor $\Phi$ using the a priori bounds of Section 7.


Remark 4.9. The right-hand side of (4.7) can be estimated from above by

$$(\Psi + M^{-1/4})^{\deg(\Delta) + |F| + |V_c(\Delta)|} ,$$

which gives a simple power counting in terms of the quantity $\Psi + M^{-1/4}$. From each summation index without an associated $Q_{a_i}$ we gain a factor $\Psi + M^{-1/4}$ if $\nu_i \neq \nu_i^*$. If there is a $Q_{a_i}$ then we gain at least a factor $\Psi + M^{-1/4}$, and, provided that $|\nu_i - \nu_i^*| \neq 2$, one additional factor $\Psi + M^{-1/4}$. Note that we gain at most two additional factors $\Psi + M^{-1/4}$ from each summation index.



Remark 4.10. As explained after (3.6), the additional term $M^{-1/2} \Psi^{-1}$ in the definition of $\Phi$ is a (necessary) technical nuisance and should be thought of as a lower order term in typical applications. In general, however, it cannot be eliminated, and Theorem 4.8 cannot be formulated in terms of powers of $\Psi$ alone. This may be seen for instance from the variance calculation of the quantity $\frac{1}{N} \sum_a Q_a (G_{\mu a} G^*_{a \mu})$. Indeed, as is apparent from (3.32), the term arising from $a = b$ is of order $N^{-1} \Psi^4$, which is in general not bounded by $\Psi^8$.



Remark 4.11. The requirement that (2.5) hold for all $p$ can be easily relaxed. Indeed, Theorem 4.8 has the following variant. Fix $\varepsilon > 0$ and $D > 0$. Then there exists a $p(\varepsilon, D) \in \mathbb N$ such that the following holds. Suppose that the hypotheses of Theorem 4.8 hold, and that (2.5) holds for $p(\varepsilon, D)$. Then

$$\mathbb P \Big( |X^w_F(\Delta)| > N^{\varepsilon} \, \Psi^{\deg(\Delta) + |F|} \, \Phi^{|V_c(\Delta)|} \Big) \le N^{-D}$$

for all $z \in \mathbf S$, all $\boldsymbol{\mu}$, and all weights $w$ adapted to $\Delta$.

This variant is an immediate consequence of the proof of Theorem 4.8 using the observation that, for any fixed $\varepsilon$ and $D$, the estimate on $X^w_F(\Delta)$ consists of a finite number of steps $s$, each of them using a bound on $\mathbb E |\zeta_{ij}|^{p_s}$ for some finite $p_s$. As $\varepsilon \to 0$ or $D \to \infty$, the number of these steps tends to infinity. Moreover, as the step index $s$ tends to infinity, the exponent $p_s$ in $\mathbb E |\zeta_{ij}|^{p_s}$ also tends to infinity.

Remark 4.12. Our result applies verbatim if (some or all) diagonal entries of the form $\mathcal G_{ii} = G_{ii} - m$ in the monomial (4.2) are replaced by $1/G_{ii} - 1/m$. (This would be a mere notational complication in the statement of Theorem 4.8.) After a little algebra (multiplying out a product of terms of the form $1/G_{ii} - 1/m$), we consequently find that our result applies to monomials divided by diagonal entries $G_{ii}$, i.e. expressions of the form

$$\frac{\mathcal G_{xy} \mathcal G_{uv} \mathcal G_{wz} \cdots}{G_{aa} G_{bb} \cdots}$$

where the indices can be either summation or external indices. This extension may be proved in two ways.
The first way is to observe that if we replace the identity 



l/m- (-hu + Zi + U^) 



— m = m 



■>(-h u + z i + uP) + 



i 3 (-h lt + Z t + U^) 



used in our proof by (3.14c) for the quantity $1/G_{ii} - 1/m$, the proof of Theorem 4.8 carries over unchanged. The second way is to write

$$\frac{1}{G_{ii}} - \frac{1}{m} = \frac{m - G_{ii}}{m^2} + \frac{(m - G_{ii})^2}{m^3} + \frac{(m - G_{ii})^3}{m^3 G_{ii}} .$$

This induces a splitting of $Z$ into three parts, which are treated separately. It is a simple matter to check that Theorem 4.8 may be applied to the first two parts. The third part is treated trivially, by freezing the index $i$; in this case we already get a factor $\Psi^3$ from the index $i$, and hence the averaging effect of the summation over $i$ is not needed, since we already gained the maximal two additional factors of $\Psi$ from $i$.



Remark 4.13. As in Section |3j in our proofs we shall assume that either (3.20) or (3.211 holds (see Sec- 



tion 



3.2). We impose these conditions in order to simplify the derivation and analysis of self-consistent 



equations such as the ones in Sections |3.2.3| and |7.1| Without them, however, our core argument remains un- 
changed. For instance, when estimating ^ a Sb a G ^ a G ^a, we instead consider the quantity V a '■= PaG^G^a- 
Using (3.14a), we may do a calculation similar to the one following (3.28), and get a self-consistent equation 
for V a . Solving the self-consistent equation entails the analysis of the Hermitian operator R = (r#) where 



Using 



Si 



, the spectral analysis from the end of Section 7.2 and Appendix A carries over with minor modifications. We omit the extraneous details of this generalization.
Remark 4.14. In [17, Lemma 5.2], a fluctuation averaging theorem of the form

$$\frac{1}{N} \sum_i Q_i \sum_{k,l}^{(i)} h_{ik} G^{(i)}_{kl} h_{li} \prec \Psi^2 \tag{4.8}$$

was proved. This result was further generalized in [5, 16, 18]. The estimate (4.8) also follows from Theorem 4.8. To see this, we use Schur's formula (3.12) to get

$$\frac{1}{N} \sum_i Q_i \frac{1}{G_{ii}} = \frac{1}{N} \sum_i Q_i h_{ii} - \frac{1}{N} \sum_i Q_i \sum_{k,l}^{(i)} h_{ik} G^{(i)}_{kl} h_{li} . \tag{4.9}$$

The second term on the right-hand side of (4.9) is the left-hand side of (4.8). The first term on the right-hand side of (4.9) is easily proved to be stochastically bounded by $N^{-1/2} \Psi$. Moreover, the left-hand side of (4.9) is stochastically bounded by $\Psi^2$, as follows from Theorem 4.8; see Remark 4.12. In fact, the left-hand side of (4.9) may be estimated using the much simpler Proposition 6.1 (whose proof trivially holds for expressions like the one on the left-hand side of (4.9)). In particular, Proposition 6.1 and this remark provide a simpler proof than [5, 16-18] of the previously known estimate (4.8).



Theorem 4.8 has the following, simpler, variant in which the averaging with respect to a weight $w$ is replaced with partial expectation.

Theorem 4.15 (Averaging using partial expectation). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Let $\Delta$ be a graph as in Definition 4.1 and $F = \emptyset$. Then

$$\prod_{a \in \mathbf a} P_a \, Z^{\mu}_{\mathbf a}(\Delta) \prec \Psi^{\deg(\Delta)} \, \Phi^{|V_c(\Delta)|} \tag{4.10}$$

for all $\mathbf a$ and $\boldsymbol{\mu}$ such that all indices of the collection $(\mathbf a, \boldsymbol{\mu})$ are distinct.

Thus in Theorem 4.15 we set $F = \emptyset$, i.e. there are no factors $Q$, whose presence would be nonsensical because the identity $P_a Q_a = 0$ implies that the partial expectation of any monomial preceded by a factor $Q$ vanishes. The condition $F = \emptyset$ is still used indirectly in the theorem since the definition of $V_c(\Delta)$ depends on $F$.

Remark 4.16. It is possible to combine Theorems 4.8 and 4.15 by splitting $\mathbf a = (\mathbf a', \mathbf a'')$, and averaging over $\mathbf a'$ with respect to a weight $w(\mathbf a')$ and taking the partial expectation $\prod_{a \in \mathbf a''} P_a$ over $\mathbf a''$. We omit the details.

Remark 4.17. Remarks 4.9-4.13 also apply to Theorem 4.15 with the obvious modifications.



5. Outline of proof 



We now outline the strategy behind the proof of Theorem 4.8. The first part of the proof relies on an inductive argument to prove the claim of Theorem 4.8 for a special class of $\Delta$'s (the chains) that encode monomials containing only factors $\mathcal G$ and not $\mathcal G^*$ (or the other way around). These $\Delta$'s act as building blocks which are used to estimate the error terms arising in the estimate of arbitrary $\Delta$'s, in the second part of the proof. The need to have a priori bounds on chains was already hinted at in Section 3.2. Indeed, the estimate (3.40) is the simplest prototype of a chain estimate, and was used to estimate quantities arising from the process of vertex resolution. This is in fact a general phenomenon: a priori bounds on chains will be used in combination with vertex resolution.

Definition 5.1 (Chains). Let $\Delta$ be a graph as in Definition 4.1.

(i) We call a vertex $i \in V_s(\Delta)$ a chain vertex if $i$ is not adjacent to itself, $i$ has degree two, and both incident edges have the same colour. We denote by $c(\Delta)$ the number of chain vertices in $\Delta$.

(ii) We call $\Delta$ an open (undirected) chain if all vertices $i \in V_s(\Delta)$ are chain vertices, $|V_e(\Delta)| = 2$, and $\deg(i) = 1$ for both $i \in V_e(\Delta)$.

(iii) We call $\Delta$ a closed (undirected) chain if all vertices $i \in V_s(\Delta)$ are chain vertices, $|V_e(\Delta)| \le 1$, and $\deg(i) = 2$ for $i \in V_e(\Delta)$.

(iv) A chain vertex $i \in V_s(\Delta)$ is directed if one incident edge is incoming and the other outgoing. A chain is directed if every $i \in V_s(\Delta)$ is directed.
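The conditions of Definition 5.1 are purely combinatorial and can be checked mechanically. The sketch below is an illustration only; the edge-list encoding `(source, target, colour)` and all function names are assumptions introduced for this example and do not appear in the paper.

```python
def degree(edges, i):
    """Number of legs at i; a loop at i counts twice."""
    return sum((j == i) + (k == i) for j, k, _ in edges)


def is_chain_vertex(edges, i):
    """Definition 5.1 (i): no loop at i, degree two, both incident edges of one colour."""
    incident = [e for e in edges if i in (e[0], e[1])]
    if any(j == k for j, k, _ in incident):
        return False
    return len(incident) == 2 and incident[0][2] == incident[1][2]


def is_directed_chain_vertex(edges, i):
    """Definition 5.1 (iv): a chain vertex with one incoming and one outgoing edge."""
    if not is_chain_vertex(edges, i):
        return False
    incoming = sum(1 for _, k, _c in edges if k == i)
    return incoming == 1


def is_open_chain(edges, summation_vertices, external_vertices):
    """Definition 5.1 (ii): every summation vertex is a chain vertex and there
    are exactly two external vertices, each of degree one."""
    return (
        all(is_chain_vertex(edges, i) for i in summation_vertices)
        and len(external_vertices) == 2
        and all(degree(edges, i) == 1 for i in external_vertices)
    )


# Hypothetical open directed chain mu1 -> a1 -> a2 -> mu2 with all edges of colour "1".
chain = [("mu1", "a1", "1"), ("a1", "a2", "1"), ("a2", "mu2", "1")]
print(is_open_chain(chain, ["a1", "a2"], ["mu1", "mu2"]))
print(sum(is_chain_vertex(chain, i) for i in ["a1", "a2"]))  # c(Delta) for this chain
```

Changing the colour of a single edge destroys the chain property at the vertices it touches, which is why chains encode monomials containing entries of one type only.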

Figure 5.1 gives a few examples of chains. The notion of a directed chain will be used in the complex Hermitian case (3.21), in which all chains that arise in our proof will be directed. In the real symmetric case (3.20), there is no such restriction.

If $\Delta$ is a chain then by definition $Z^{\mu}_{\mathbf a}(\Delta)$ contains no diagonal entries $\mathcal G_{ii}$. Since $\mathcal G_{ij} = G_{ij}$ for $i \neq j$, we may (and shall) therefore replace all entries of $\mathcal G$ with entries of $G$ when $\Delta$ is a chain.

Chains are useful in combination with the following family of special weights.

Figure 5.1. From left to right: an open directed chain, a closed undirected chain with one external vertex, a closed directed chain with no external vertices.



Definition 5.2 (Chain weights). Let $n \in \mathbb N$. For any fixed $\mathbf b = (b_1, \dots, b_n)$ define the weight

$$w_{\mathbf b}(\mathbf a) = w(\mathbf a) := s_{a_1 b_1} \cdots s_{a_n b_n} . \tag{5.1}$$

We call weights of the form (5.1) chain weights.
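The normalization behind chain weights can be checked by hand on a toy example. The following sketch assumes a hypothetical cyclic band profile for the variances $s_{ij}$ (chosen only so that $\sum_j s_{ij} = 1$, as in the Introduction) and uses exact rational arithmetic. It verifies only the bound $\sum_{\mathbf a} w(\mathbf a) \le 1$; the full set of conditions in Definition 4.4 is not reproduced here.

```python
from fractions import Fraction
from itertools import product


def band_variances(n_sites, width):
    """Hypothetical cyclic band profile: s_ij = 1/(2*width+1) if i and j are
    within cyclic distance `width`, and 0 otherwise (needs 2*width+1 <= n_sites)."""
    s = [[Fraction(0)] * n_sites for _ in range(n_sites)]
    for i in range(n_sites):
        for d in range(-width, width + 1):
            s[i][(i + d) % n_sites] = Fraction(1, 2 * width + 1)
    return s


def chain_weight(s, a, b):
    """w_b(a) = s_{a_1 b_1} * ... * s_{a_n b_n}, as in (5.1)."""
    w = Fraction(1)
    for a_i, b_i in zip(a, b):
        w *= s[a_i][b_i]
    return w


s = band_variances(5, 1)
b = (0, 3)
all_a = list(product(range(5), repeat=len(b)))

# Sum over all index tuples: factorizes into row sums of s, hence equals 1.
total = sum(chain_weight(s, a, b) for a in all_a)
# Sum restricted to distinct indices: can only be smaller.
distinct = sum(chain_weight(s, a, b) for a in all_a if len(set(a)) == len(a))
print(total, distinct)
```

The unrestricted sum factorizes as $\prod_i \sum_{a_i} s_{a_i b_i} = 1$, so any restriction of the summation (for instance to distinct indices) yields a total weight of at most one.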



Using (2.2), it is easy to check that a chain weight from Definition 5.2 is a weight in the sense of Definition 4.4. The role of chains is highlighted by the two following facts.

• If $\Delta$ is a chain and $w_{\mathbf b}$ is an adapted chain weight, then the family $\big( \sum_{\mathbf a} w_{\mathbf b}(\mathbf a) Z_{\mathbf a}(\Delta) \big)_{b_1}$ for fixed $(b_2, \dots, b_{n-1})$ satisfies a stable self-consistent equation; see Step I2 below.

• Proving Theorem 4.8 (in fact, a weaker version given in Proposition 5.3 below) for a chain $\Delta$ and an adapted chain weight is a key tool for proving Theorem 4.8 for arbitrary $\Delta$.



Proposition 5.3. Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$, and recall the definition (3.1) of $\Phi$. Let $\Delta$ be a chain, $w$ an adapted chain weight, and $F \subset V_s(\Delta)$. Then we have

$$X^w_F(\Delta) \prec \Psi^{\deg(\Delta) + |F|} \, \Phi^{c(\Delta) - |F|} \tag{5.2}$$

for any $\boldsymbol{\mu}$ and adapted chain weight $w$.

As an a priori bound in Sections 8 and 9, we shall always use Proposition 5.3 with $F = \emptyset$. The statement of Proposition 5.3 for $F = \emptyset$ may be summarized by saying that from each chain vertex we gain a factor $\Phi$ (as compared to the trivial bound (4.6)).

Next, we outline the proof of Theorem 4.8. The argument consists of two main steps: establishing a priori bounds on chains (i.e. proving Proposition 5.3) and proving Theorem 4.8 using Proposition 5.3 as input.

Proposition 5.3 is proved first for open chains, using a two-step induction. The induction parameter is the length of the chain $\ell := \deg(\Delta)$. The induction is started at $\ell = 1$, and consists of two steps, I1 and I2. It may be summarized in the form

$$(\ell = 1, F = \emptyset) \Longrightarrow (\ell = 2, F \neq \emptyset) \Longrightarrow (\ell = 2, F = \emptyset) \Longrightarrow (\ell = 3, F \neq \emptyset) \Longrightarrow (\ell = 3, F = \emptyset) \Longrightarrow \cdots$$

What follows is a sketch of Steps I1 and I2.
Step I1. The input for Step I1 is the claim of Proposition 5.3 with $F = \emptyset$, for all open chains $\Delta'$ satisfying $\deg(\Delta') < \ell$. Using a high-moment expansion, we estimate $X^w_F(\Delta)$, where $\Delta$ is an open chain, $\deg(\Delta) = \ell$, and $F \neq \emptyset$. The details are carried out in Section 7.1.

Step I2. We fix an open chain $\Delta$ and prove the claim of Proposition 5.3 for $F = \emptyset$ under the assumption that the claim of Proposition 5.3 has been established for

(i) $\Delta$ with $F \neq \emptyset$;

(ii) all open chains $\Delta'$ satisfying $\deg(\Delta') < \deg(\Delta)$ with $F = \emptyset$.

The proof is based on a self-consistent equation for the family $\big( \sum_{\mathbf a} w_{\mathbf b}(\mathbf a) Z_{\mathbf a} \big)_{b_1}$ for fixed $b_2, \dots, b_n$. This self-consistent equation will be stable provided $E = \operatorname{Re} z$ lies away from the spectral edges $\pm 2$. This stability is ensured by the fact that $Z$ only contains factors $G$ and not $G^*$. The details are carried out in Section 7.2.



The induction is started by noting that Proposition 5.3 holds trivially for the open chain of length 1 (which has no chain vertex), encoding the monomial $G_{\mu \nu} \prec \Psi$. After Steps I1 and I2 are complete, the induction argument outlined above completes the proof of Proposition 5.3 for open chains. The proof for closed chains is almost identical, except that no induction is needed; the only required assumption is that Proposition 5.3 hold for open chains of arbitrary degree.

Once Proposition 5.3 has been proved, we use it as input to prove Theorem 4.8 for a general $\Delta$. Similarly to Step I1, we use a high-moment expansion. The estimates are considerably more involved than in Step I1, however. (In the language of Sections 3.2.4 and 8, we use vertex resolution to gain extra powers of $\Psi$ from the charged vertices.) The details are carried out in Sections 8-9.

We record the following guiding principle for the entire proof of Theorem 4.8. It is a basic power counting that can be summarized as follows. The size of $X^w_F(\Delta)$ is given by a product of three main ingredients:

(a) The naive size $\Psi^{\deg(\Delta)}$, where $\deg(\Delta)$ is simply the number of entries of $\mathcal G$ in $X^w_F(\Delta)$ (obtained by a trivial power counting and $\Lambda \prec \Psi$).

(b) The smallness arising from $F$, i.e. $\Psi^{|F|}$ (obtained from the linking imposed by the factors $Q$).

(c) The smallness arising from the charged vertices, i.e. $\Phi^{|V_c(\Delta)|}$ (obtained from vertex resolution and the a priori bounds of Proposition 5.3 applied to chain vertices).

We shall frequently refer to the factors $\Psi^{|F|}$ and $\Phi^{|V_c(\Delta)|}$ from (b) and (c) as gain over the naive size $\Psi^{\deg(\Delta)}$. It is very important for the whole proof that the mechanism of this gain is local in the graph, i.e. operates on the level of individual vertices. Each factor gained in the case (c) can be associated with a charged vertex. In the case (b), a linking results in an additional edge adjacent to the vertex on which a linking was performed. There will be some technical complications which somewhat obscure this picture, such as occasionally coinciding indices. We shall always analyse these exceptional situations by comparing them to the basic power counting dictated by the generic situation. We remark that these "exceptional" situations sometimes in fact lead to leading-order error terms, which is for instance the reason why the parameter $\Phi$ cannot in general be replaced with $\Psi$ in (4.7).

Figure 5.2 contains a diagram summarizing all key steps of the proof.

Figure 5.2. The structure of the proof of Theorem 4.8. Concepts and arguments are displayed in rounded boxes, statements and results in rectangular boxes.

We conclude this section with an outline of Sections 6-9. In Section 6 we present a simple high-moment estimate that only uses the process of linking (see Definition 3.10); more algebraically, the argument of Section 6 only uses Family A identities (and not Family B). The result is Proposition 6.1 which obtains a gain of a factor $\Psi$ from each $Q$ but no gain from charged vertices (see Definition 4.7). The goal of Section 6 is twofold, the first goal being pedagogical. It provides a complete but vastly simplified proof of a special case of Theorem 4.8, thereby illustrating the process of linking. In addition, it lays the ground for Step I1 used to derive a priori bounds on chains, as well as for the more complicated high-moment estimates used in the full proof of Theorem 4.8.

Section 7 is devoted to chains; its goal is to prove Proposition 5.3. Step I1 is proved in Section 7.1 and Step I2 in Section 7.2. The induction, and hence the proof of Proposition 5.3, is completed in Section 7.3. In Section 8 we prove Theorem 4.8 under four simplifying assumptions, (S1)-(S4), listed in Sections 6 and 8. These simplifications allow us to ignore some additional complications, and give a streamlined argument in which the fundamental mechanism is evident. The starting point for the argument in Section 8 is the high-moment expansion using vertex linking, already introduced in Section 6. In addition, we make use of Family B identities, which leads us to the process of vertex resolution (sketched in Section 3.2.4). In Section 9 we present the additional arguments needed to drop Simplifications (S1)-(S4), and hence prove Theorem 4.8 in full generality. Finally, in Section 10 we prove Theorem 4.15 as a relatively easy consequence of Theorem 4.8.



6. Warmup: simple high-moment estimates



We now move on to the high-moment estimates which underlie our proofs. The idea is to derive high-probability bounds on $X^w_F(\Delta)$ by controlling its high moments using a graphical expansion scheme.

For pedagogical reasons, we shall throughout the following selectively ignore some complications so as to make the core strategy clearer. We shall eventually put back the complications one by one. In this section we consistently assume the following simplification.

(S1) All summation indices in the expanded summation $\mathbb E |X^w_F(\Delta)|^p$ (see (6.5) below) are distinct. (I.e. we ignore repeated indices, which give rise to a smaller combinatorics of the summation.)

In this section we present a simple argument which proves the following weaker estimate. 

Proposition 6.1. Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$, $\Delta$ is a graph as in Definition 4.1, and $w$ is an adapted weight. Then for all $F \subset V_s(\Delta)$ and $\boldsymbol{\mu}$ we have

$$X^w_F(\Delta) \prec \Psi^{\deg(\Delta) + |F|} . \tag{6.1}$$

The estimate (6.1) expresses that from each $Q$ in $X^w_F(\Delta)$ one gains an additional factor $\Psi$.

Remark 6.2. As in Remark 4.12, the statement of Proposition 6.1 remains true if some (or all) diagonal entries of the form $\mathcal G_{aa} = G_{aa} - m$ are replaced by $1/G_{aa} - 1/m$. The proof is exactly the same.



The simplified argument behind the proof of Proposition 6.1 uses only the Family A identities, i.e. (3.13). It relies on a high-moment estimate of the following form. The precise statement is somewhat complicated by the need to keep track of low-probability exceptional events. The sum over $\Gamma \in \mathfrak G$ in Lemma 6.3 will arise as a summation over graphs.

Lemma 6.3. Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$, and let $p \in 2 \mathbb N$ be even. Then we have

$$\mathbb E |X^w_F(\Delta)|^p \le \sum_{\Gamma \in \mathfrak G} \mathbb E X_\Gamma , \tag{6.2}$$

where $\mathfrak G$ is a finite set (depending on $\Delta$, $F$, and $p$) and $X_\Gamma$ is a random variable satisfying

$$X_\Gamma \prec \Psi^{p (\deg(\Delta) + |F|)} \tag{6.3}$$

as well as the rough bound

$$\mathbb E |X_\Gamma|^2 \le N^{C_p} \tag{6.4}$$

for some constant $C_p$.

Before proving Lemma 6.3 we show how it implies Proposition 6.1.


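Proof of Proposition 6.1. Let $\varepsilon > 0$ and $D > 0$ be given. Define $p$ as the smallest even number greater than $4D/\varepsilon$, and abbreviate $q := \deg(\Delta) + |F|$. Then by Lemma 6.3, for each $\Gamma \in \mathfrak G$ there exists an event $\Xi_\Gamma$ such that

$$|X_\Gamma| \, \mathbf 1(\Xi_\Gamma) \le N^{\varepsilon p / 2} \Psi^{pq} , \qquad \mathbb P(\Xi_\Gamma^c) \le N^{-C_p - pq}$$

for all $w$, $\boldsymbol{\mu}$, and $z \in \mathbf S$. Then we find, using Lemma 6.3 again,

$$\mathbb E |X^w_F(\Delta)|^p \le \sum_{\Gamma \in \mathfrak G} \Big( \mathbb E \big( X_\Gamma \mathbf 1(\Xi_\Gamma) \big) + \mathbb E \big( X_\Gamma \mathbf 1(\Xi_\Gamma^c) \big) \Big) \le \sum_{\Gamma \in \mathfrak G} \Big( N^{\varepsilon p / 2} \Psi^{pq} + \big( \mathbb E |X_\Gamma|^2 \big)^{1/2} \, \mathbb P(\Xi_\Gamma^c)^{1/2} \Big) \le 2 |\mathfrak G| N^{\varepsilon p / 2} \Psi^{pq}$$

for all $w$, $\boldsymbol{\mu}$, and $z \in \mathbf S$. Therefore Chebyshev's inequality gives

$$\mathbb P \big( |X^w_F(\Delta)| > N^{\varepsilon} \Psi^q \big) \le 2 |\mathfrak G| N^{-\varepsilon p / 2} \le N^{-D}$$

for all $w$, $\boldsymbol{\mu}$, and $z \in \mathbf S$. □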
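The choice of $p$ in the proof above can be made concrete. The sketch below is illustrative only and its function names are assumptions. It computes the smallest even integer exceeding $4D/\varepsilon$ and the resulting exponent $\varepsilon p / 2$, which is strictly larger than $2D$; the surplus power $N^{-D}$ is what absorbs the combinatorial factor $2|\mathfrak G|$ for large $N$.

```python
import math
from fractions import Fraction


def smallest_even_above(x):
    """Smallest even integer strictly greater than x."""
    p = math.floor(x) + 1
    return p if p % 2 == 0 else p + 1


def moment_order_and_decay(eps, D):
    """Return (p, eps*p/2) for the choice of p made in the proof of Proposition 6.1.

    The Chebyshev bound decays like N^(-eps*p/2), and eps*p/2 > 2*D by construction.
    """
    p = smallest_even_above(4 * Fraction(D) / Fraction(eps))
    return p, Fraction(eps) * p / 2


# Hypothetical values of the tolerance eps and the target decay rate D.
p, decay = moment_order_and_decay(Fraction(1, 10), 10)
print(p, decay, decay > 2 * 10)
```

Exact rational arithmetic is used so that borderline cases, where $4D/\varepsilon$ is itself an even integer, are handled without rounding issues.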



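The rest of this section is devoted to the proof of Lemma 6.3. All of our estimates will be uniform in $w$ and $\boldsymbol{\mu}$, and we shall henceforth no longer mention this explicitly. Throughout this section we assume Simplification (S1).

Proof of Lemma 6.3. The idea of the proof was already outlined in Section 3.2.1. Let $\Delta$ have $n$ summation indices, denoted by $a_1, \dots, a_n$, and $k$ external indices, denoted by $\mu_1, \dots, \mu_k$. Let $F \subset \{1, \dots, n\}$. Let $p \in 2 \mathbb N$ be even and write

$$\mathbb E |X^w_F(\Delta)|^p = \sum_{\mathbf a^1}^{(\mu)*} \cdots \sum_{\mathbf a^p}^{(\mu)*} w(\mathbf a^1) \cdots w(\mathbf a^p) \; \mathbb E \prod_{j = 1}^{p/2} \bigg( \prod_{i \in F} Q_{a_i^j} \, Z^{\mu}_{\mathbf a^j}(\Delta) \bigg) \prod_{j = p/2 + 1}^{p} \overline{\bigg( \prod_{i \in F} Q_{a_i^j} \, Z^{\mu}_{\mathbf a^j}(\Delta) \bigg)} , \tag{6.5}$$

where we abbreviated

$$\mathbf a^j := (a_i^j : 1 \le i \le n) , \qquad \mathbf a := (a_i^j : 1 \le i \le n , \, 1 \le j \le p) .$$

We now make the crucial observation that $w(\mathbf a) := w(\mathbf a^1) \cdots w(\mathbf a^p)$ is a weight on the set of indices $(i, j)$; this is an elementary consequence of Definition 4.4. In particular, $\sum_{\mathbf a} w(\mathbf a) \le 1$.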
By Simplification (S1), we assume that all indices $\mathbf a$ are distinct: in addition to the constraint $a_i^j \notin \{\mu_1, \dots, \mu_k\}$, we introduce into (6.5) an indicator function that imposes $a_i^j \neq a_{i'}^{j'}$ if $(i, j) \neq (i', j')$.

We now make each Q xy independent of as many summation indices as possible using Family A identities. 
To that end, we define 



,(T) 



G 



(T) 



Using (3.13) iteratively, we expand every factor $\mathcal G_{xy}$ appearing in (6.5) in all the indices

$$\mathbf a_F := (a_i^j : i \in F , \, 1 \le j \le p)$$

associated with a factor $Q$. Let $\mathcal G_{xy}$ be a fixed entry in (6.5). The idea is to successively add to $\mathcal G_{xy}$ as many upper indices from the collection $\mathbf a_F$ as possible. The goal is to obtain a quantity satisfying the following definition.

Definition 6.4. An entry $G^{(T)}_{xy}$ or $\mathcal G^{(T)}_{xy}$ is maximally expanded in $S$ if $S \subset T \cup \{x, y\}$. In other words, a maximally expanded resolvent entry cannot be expanded any further in the indices $S$ using (3.13).



Along the expansion of each $\mathcal G_{xy}$ using (3.13), new terms in (6.5) appear; each such term is a monomial of entries of $\mathcal G$ divided by diagonal entries of $G$. We stop expanding a term if either

(a) all its factors are maximally expanded in $\mathbf a_F$, or

(b) it contains $\deg(\Delta) + 2pn$ entries of $\mathcal G$ in the numerator.

The precise recursive procedure is as follows. We start by setting $A := \mathcal G_{xy}$, where $\mathcal G_{xy}$ is an entry on the right-hand side of (6.5).



1. Let $\mathcal G^{(T)}_{uv}$ denote an entry in $A$ and $d$ an index in $\mathbf a_F$ such that $d \notin T \cup \{u, v\}$. (This choice is arbitrary and unimportant.) If (a) no such pair exists, or (b) $A$ contains $\deg(\Delta) + 2pn$ factors $\mathcal G$ in the numerator, stop the recursion of the term $A$.

2. Using Family A identities, write

$$\mathcal G^{(T)}_{uv} = \mathcal G^{(Td)}_{uv} + \frac{G^{(T)}_{ud} G^{(T)}_{dv}}{G^{(T)}_{dd}} \tag{6.6}$$

if $\mathcal G^{(T)}_{uv}$ is a resolvent entry in the numerator and

$$\frac{1}{G^{(T)}_{uu}} = \frac{1}{G^{(Td)}_{uu}} - \frac{G^{(T)}_{ud} G^{(T)}_{du}}{G^{(T)}_{uu} \, G^{(Td)}_{uu} \, G^{(T)}_{dd}} \tag{6.7}$$

if $G^{(T)}_{uv} = G^{(T)}_{uu}$ is a diagonal resolvent entry in the denominator. This yields the splitting $A = A' + A''$, where both terms have the form of a product of entries in the numerator and diagonal entries in the denominator. Repeat step 1 for both $A'$ and $A''$ (playing the role of $A$ in step 1).

It is not hard to see that the stopping rule defined by the conditions (a) or (b) ensures that the recursion terminates after a finite number of steps. Indeed, each application of step 2 increases the quantity "number of entries of $\mathcal G$" + "number of upper indices", which must remain bounded by the stopping rules (a) and (b).
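The two identities driving the recursion are exact algebraic facts about resolvents of minors, and they can be verified on a small example. The sketch below is an illustration and not part of the paper: the matrix `H`, the rational spectral parameter `z`, and the helper names are assumptions, and a real rational `z` is used only so that the check can be done in exact arithmetic (the identities are algebraic and hold for any `z` at which the inverses exist).

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(h, z, removed=()):
    """Entries of the resolvent of the minor of h with rows/columns `removed` deleted,
    returned as a dict keyed by the original index labels."""
    keep = [i for i in range(len(h)) if i not in removed]
    m = [[Fraction(h[i][j]) - (z if i == j else 0) for j in keep] for i in keep]
    inv = inverse(m)
    return {(keep[r], keep[c]): inv[r][c]
            for r in range(len(keep)) for c in range(len(keep))}


# Hypothetical symmetric integer matrix; z = 1/3 cannot be an eigenvalue of it
# or of any of its minors, so all inverses below exist.
H = [[2, 1, 0, 3],
     [1, -1, 4, 0],
     [0, 4, 1, 2],
     [3, 0, 2, 5]]
z = Fraction(1, 3)
u, v, d = 0, 1, 3

G = resolvent(H, z)                  # upper index set T is empty
G_d = resolvent(H, z, removed=(d,))  # upper index set T u {d}

# Identity (6.6): adding the upper index d to an entry in the numerator.
rhs_66 = G_d[(u, v)] + G[(u, d)] * G[(d, v)] / G[(d, d)]
# Identity (6.7): adding the upper index d to a diagonal entry in the denominator.
rhs_67 = 1 / G_d[(u, u)] - G[(u, d)] * G[(d, u)] / (G[(u, u)] * G_d[(u, u)] * G[(d, d)])
print(G[(u, v)] == rhs_66, 1 / G[(u, u)] == rhs_67)
```

In both identities the first term on the right-hand side carries one more upper index, while the second term carries one more resolvent entry in the numerator; this is the bookkeeping on which the stopping rules (a) and (b) rely.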
The result is of the form

$$\mathcal G_{xy} = \sum_{\alpha} H_\alpha + R ,$$

where each summand $H_\alpha$ is a fraction with entries of $\mathcal G$ in the numerator and diagonal entries of $G$ in the denominator, all of them maximally expanded in $\mathbf a_F$. Here the rest term $R$ satisfies

$$R \prec \Psi^{\deg(\Delta) + 2pn} , \qquad \mathbb E |R|^2 \le N^{C_{p, \Delta}} \tag{6.8}$$

for some constant $C_{p, \Delta}$. The first estimate of (6.8) follows from (3.16) combined with (3.11) and Lemma 3.6, and the second estimate of (6.8) from (3.18) and (3.19).



We then multiply the resulting sums on the right-hand side of (4.2) out to get 



(6.9) 



where Y£' a is a monomial and a a counting index ranging over some finite set. Each term Y£' a is a fraction 
with entries of Q in the numerator and diagonal entries of F in the denominator. Moreover, either (i) all 
entries of Y£> a are maximally expanded in or (ii) Yi' a -< ^deg(A)+2pn ^ 

arises if Y^' a contains 



one or more rest terms R). We now multiply out the expectation in (6.5) as 



E 





E E 



]\Qa- ]Yi 



P,a P 



\i£F 



(6.10) 



We plug this into (6.5 1 and pull out the summation over ax, ... , a p . This gives rise to the summation in 



(6.2), indexed by the set © = {(ax, . .. ,a p )}. If, for some (ax, . . . , a p ), one or more of Y^ ai , . . . , Y e 



p,a P 



IS 



not maximally expanded in ap, it is easy to see that 



(m)* (m)* 
^^(a 1 )...^^ 



\i£F 



\i£F 



(6.11) 



satisfies (6.3) and (6.4). Indeed, each ter m Yj '" 3 contains at least deg(A) entries of Q\ thus the trivial bound 
Y i> a i ^ ^dcg(A) always ho lds by Lemma 



(6.8), (3.18), and (3.19), we find (6.3) and (6.4 1 



3.9 



Using Lemma 3.6 we can multiply these estimates. Recalling 



It therefore suffices to consider products of Y^' Qj 's in (|6.10|) which are all maximally expanded in ajr 



(i.e. terms which are products of H^s only and not R's). The presence of Q's leads to the following crucial 
restriction on terms yielding a nonzero contribution to (6.10). For each i 6 F, we claim that at least one of 
y a 2 '" 2 , . . . ,Y£' a " is not independent of a}. This follows fr om the observation that generally E[Q a (X)F] = 
if Y is independent of a. More generally, we require that, for any i £ F and j = 1, . . . ,p, at least of one of 



Y L 



Y 



3, a j 



) 2 a 



P,a p 



is not independent of a\ (hat indicates omission from the list). This imposes a constraint on the terms that 
survive the expansion. 

Moreover, the term that is not independent of $a_i^j$ contains at least one additional entry of $\mathcal G$, since at some point the formula (6.6) or (6.7) had to be applied with $d = a_i^j$ and the second term of (6.6) or (6.7) contains at least one additional entry of $\mathcal G$. Since we assumed Simplification (S1), i.e. all $a_i^j$'s are different, it is a general fact that each $Q$ gives rise to an additional off-diagonal entry of $\mathcal G$ and contributes a factor $\Lambda \prec \Psi$ to (6.5). In other words, any $X_{\alpha_1 \dots \alpha_p}$ yielding a nonzero contribution to (6.5) has at least $p(\deg(\Delta) + |F|)$ entries of $\mathcal G$ in the numerator. Recalling Lemma 3.6 and Lemma 3.9, we find that any term $X_{\alpha_1 \dots \alpha_p}$ yielding a nonzero contribution to (6.5) satisfies (6.3) and (6.4). This concludes the proof of Lemma 6.3. □
6.1. Graphical representation. The phenomenon behind the proof of Lemma 6.3 in fact has a simple graphical representation, which will prove essential for later, more intricate, estimates. We illustrate its usefulness by applying it to the proof of Lemma 6.3. We recall the basic graphical notation introduced in Section 3.2.2. The quantity whose expectation we are estimating, $|X^w_F(\Delta)|^p = [X^w_F(\Delta)]^{p/2} \, [\overline{X^w_F(\Delta)}]^{p/2}$, has a natural representation in terms of a multigraph, which we call $\gamma^p(\Delta)$ and which is essentially a $p$-fold copy of the graph $\Delta$ encoding $Z$. The graph $\gamma^p(\Delta)$ is obtained as follows.

(i) Take $p/2$ copies of $\Delta$ and $p/2$ copies of $\Delta$ whose edges have inverted direction and colour arising from the relation $\overline{\mathcal G_{ab}} = \mathcal G^*_{ba}$. More precisely, this inversion means that each edge $e \in E(\Delta)$ gives rise to an inverted edge $e'$ satisfying

$$\xi_{e'} = \begin{cases} * & \text{if } \xi_e = 1 \\ 1 & \text{if } \xi_e = * \end{cases} , \qquad \alpha(e') = \beta(e) , \quad \beta(e') = \alpha(e) .$$

(ii) For each external vertex $i \in V_e(\Delta)$ merge all $p$ copies of $i$ to form a single vertex.

Note that the set $F$ is not depicted in $\gamma^p(\Delta)$. The vertex set of $\gamma^p(\Delta)$ consists of summation vertices and external vertices (this classification is inherited from the vertices of $\Delta$ in the obvious way), so that we may write $V(\gamma^p(\Delta)) = V_s(\gamma^p(\Delta)) \cup V_e(\gamma^p(\Delta))$.
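The construction in (i) and (ii) can be sketched in a few lines of Python. This is an illustration under assumptions: edges are encoded as `(source, target, colour)` triples, the copies of a summation vertex `v` are labelled `(v, j)`, and the function name `gamma_p` is hypothetical.

```python
def gamma_p(edges, external, p):
    """Build the edge list of the multigraph gamma^p(Delta).

    edges    -- list of (source, target, colour) with colour in {"1", "*"}
    external -- set of external vertices; all copies of these are merged
    p        -- an even positive integer
    Copies j = 1..p/2 are plain copies of Delta; copies j = p/2+1..p have
    every edge reversed and its colour flipped.
    """
    if p <= 0 or p % 2 != 0:
        raise ValueError("p must be a positive even integer")
    flip = {"1": "*", "*": "1"}

    def label(vertex, j):
        # External vertices keep one shared label; summation vertices are copied.
        return vertex if vertex in external else (vertex, j)

    result = []
    for j in range(1, p + 1):
        for src, dst, colour in edges:
            if j <= p // 2:
                result.append((label(src, j), label(dst, j), colour))
            else:
                result.append((label(dst, j), label(src, j), flip[colour]))
    return result


# Hypothetical graph with one external vertex "mu" and one summation vertex "a".
delta = [("mu", "a", "1"), ("a", "mu", "*")]
graph = gamma_p(delta, external={"mu"}, p=2)
vertices = {v for src, dst, _ in graph for v in (src, dst)}
print(len(graph), len(vertices))
```

The resulting multigraph has $p \, |E(\Delta)|$ edges, $p$ copies of every summation vertex, and a single copy of every external vertex.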

Definition 6.5 (Projection $\pi$). We introduce the $p$-to-one canonical projection $\pi : V(\gamma^p(\Delta)) \to V(\Delta)$, defined as $\pi(i) = j$ if $i$ is a copy of $j$ in the construction of $\gamma^p(\Delta)$.




Figure 6.1. The graph 7 2 (A) that encodes E\X%(A)\ 2 , where Z aia2 = Q l _ il a 1 Qa 1 a 2 Ql 2li2 Ql ia2 Q^2ai- 



We start with the graph $\gamma^p(\Delta)$ (see Figure 6.1). We shall construct a set $\mathfrak G^p_F(\Delta)$ of graphs, denoted by $\Gamma$, on the same vertex set $V(\gamma^p(\Delta))$. The algorithm that generates $\mathfrak G^p_F(\Delta)$ is precisely the one given after Definition 6.4. On the level of graphs, this algorithm consists of a repeated application of the graphical rules in Figures 3.5 and 3.6. (Note that the second identity of Figure 3.6 is also valid for $\mathcal G$ instead of $G$, i.e. without the black diamonds.) As indicated in Figures 3.5 and 3.6, we keep track of the upper indices associated with an edge by attaching a list of upper indices to each edge. The algorithm terminates when either all edges are maximally expanded in $\mathbf a_F$ or there are $\deg(\Delta) + pn$ edges that do not bear a diamond (i.e. that contribute a factor $\Psi$). Indicating these upper indices may be more precisely implemented using decorated edges, but we shall not need such formal constructions.

Recall from Definition 3.10 that choosing the second graph on the right-hand side of any identity in Figures 3.5 and 3.6 is called linking (an edge with a vertex). As shown above, any graph $\Gamma \in \mathfrak G^p_F(\Delta)$ whose edges are not maximally expanded yields a small enough contribution by a trivial power counting. In the following we shall therefore only consider the remaining graphs, i.e. we shall assume that all edges of $\Gamma \in \mathfrak G^p_F(\Delta)$ are maximally expanded in $\mathbf a_F$. Moreover, the upper indices of an edge are uniquely determined by the constraint that the edge be maximally expanded: the entry encoded by the edge $(x, y)$ is $\mathcal G^{(\mathbf a_F \setminus \{x, y\})}_{xy}$. Thus, we shall consistently drop the upper indices associated with edges from our graphs $\Gamma$.

We introduce some convenient notions when dealing with graphs in $\mathfrak G^p_F(\Delta)$.

Definition 6.6. Let $\Gamma \in \mathfrak G^p_F(\Delta)$.

(i) We denote the vertex set of $\Gamma$ by $V(\Gamma) = V_s(\Gamma) \cup V_e(\Gamma)$, where $V_s(\Gamma)$ denotes the summation vertices and $V_e(\Gamma)$ the external (fixed) vertices. By definition, all three sets are the same as those of $\gamma^p(\Delta)$.

(ii) We denote the set of edges of $\Gamma$ by $E(\Gamma)$; the set $E(\Gamma)$ has a colouring $\xi : E(\Gamma) \to \{1, *\}$.

(iii) The $p$-to-one canonical projection $\pi : V(\Gamma) \to V(\Delta)$ is taken over from Definition 6.5.

Thus, $\mathbf a_F = (a_i : i \in \pi^{-1}(F))$. In this manner we write the sum of maximally expanded terms on the right-hand side of (6.10) as a sum of graphs $\Gamma \in \mathfrak G^p_F(\Delta)$. By definition, each vertex in $\pi^{-1}(F)$ has been linked with at least one edge, possibly more. Each such linking adds an edge to the graph, and hence contributes a factor $\Lambda \prec \Psi$ to its size. This concludes the graphical discussion behind the proof of (6.1). Figure 6.2 shows two sample graphs from $\mathfrak G^2_F(\Delta)$ for the graph $\Delta$ from Figure 6.1, where $F$ consists of a single vertex associated with the summation variable $a_1$.



7. Chains 



In this section we derive the a priori estimate on chains, Proposition 5.3.

7.1. Step I1: chains with $F \neq \emptyset$. Step I1 is an application of the simple high-moment expansion method from Section 6. It is formulated in the following proposition. In Section 7.3 it will be used in conjunction with Proposition 7.2 below to complete the induction and hence the proof of Proposition 5.3.

Proposition 7.1 (Induction Step I1). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$, and let $\ell \ge 2$. Suppose that

$$X^w_F(\Delta) \prec \Psi^{\deg(\Delta) + |F|} \, \Phi^{c(\Delta) - |F|} \tag{7.1}$$

holds for any open chain $\Delta$ of degree strictly less than $\ell$, $F = \emptyset$, and any adapted chain weight $w$. Then (7.1) holds for any open chain $\Delta$ of degree $\ell$, $F \neq \emptyset$, and any adapted chain weight $w$.

In this section we continue to assume Simplification (S1) (see the beginning of Section 6).
Figure 6.2. Left: a graph obtained from the one in Figure 6.1 with $|F| = 1$ by linking both vertices in $\mathbf a_F = (a_1^1, a_1^2)$
with an edge; a\ was linked with the edge (/X2, a\) and a\ was linked with (/xi, a}). This graph contains the minimal 
number of edges to yield a nonzero contribution after taking the expectation. Hence it is of leading order. Right: a 
graph obtained by linking two further edges with ai, namely the edges (a?, ^2) and (pi,af). Its size is subleading. 



Proof of Proposition 7.1. For simplicity of notation, we focus on the case where $\Delta$ is a directed chain of degree $\ell$; the undirected case is proved in the same way. The argument is best understood in a representative example,

$$Z_{\mathbf a} := G_{\mu_1 a_1} G_{a_1 a_2} G_{a_2 a_3} G_{a_3 a_4} G_{a_4 \mu_2} , \tag{7.2}$$

which is encoded by the graph $\Delta$ depicted in Figure 7.1. Let us take $F = \{1\}$ and compute the variance of $X^w_F(\Delta)$. In the following we use the terminology and notation of Section 6 without further comment.

Figure 7.1. The graph $\Delta$ that encodes $Z$ defined in (7.2).

In Section 6 it was shown that the only graphs $\Gamma \in \mathfrak G^2_F(\Delta)$ that contribute are those in which the vertices $a_1^1$ and $a_1^2$ have both been linked to some edge. Figure 7.2 shows such a graph $\Gamma$ of leading order. Since the two vertices $a_1^1$ and $a_1^2$ have been linked, they each contribute a factor $\Lambda \prec \Psi$ (two edges were added by the linking process). Now we break the graph in Figure 7.2 down to its chains, i.e. we freeze all those summation vertices, $a_1^1$ and $a_1^2$, that were linked to. What remains is a collection of chains, each shorter than the original chain $\Delta$. In this example there are four nontrivial subchains:



a\, a\ -> a\ -> [i 2 , A*2 -> a\ -> a\ 



Moreover, the monomial encoded by each subchain lies either inside $Q_{a_1^1}(\cdot)$, inside $Q_{a_1^2}(\cdot)$, or inside neither. Thus the monomials encoded by the first two subchains lie inside $Q_{a_1^1}(\cdot)$, and the monomials encoded by the two last subchains inside $Q_{a_1^2}(\cdot)$.

Figure 7.2. Left: a graph $\Gamma$ of leading order in $\mathfrak{G}^p_F(\Delta)$ with $\Delta$ defined through (7.2), and $F = \{1\}$. We do not draw the loops that encode diagonal resolvent entries in the denominator. Right: the same graph broken down to chains.

Now we may invoke the induction assumption (i.e. Proposition 5.3 for $F = \emptyset$) on each of the four subchains. We use that they all have degree strictly less than $\deg(\Delta)$. To be precise, before invoking Proposition 5.3 we have to get rid of the upper indices using (3.13); see below for details.

Moreover, we ignore some minor technicalities associated with coinciding indices. By Simplification (SI), we assumed that all summation indices of $\Gamma$ were distinct. In particular, the indices associated with different subchains of $\Gamma$ are distinct, which implies that the subchains of $\Gamma$ are coupled. This coupling is manifested in the fact that summation indices within a subchain are subject to additional restrictions that are unrelated to that subchain: these summation indices cannot take on values of indices in other subchains. This means that summations cannot be performed independently within each subchain. Hence, strictly speaking, we may not invoke Proposition 5.3 for each subchain; in order to do so, we first have to decouple the subchains so as to get a product of terms associated with the subchains. In order to achieve this decoupling we have to allow indices associated with different subchains to coincide. This decoupling is a simple inclusion-exclusion argument whose details are postponed to Lemma 9.5 in Section 9.3.



Summarizing this example, we obtain an estimate of order $\Psi^{12} \Phi^{6}$ for the graph depicted in Figure 7.2. Since, by Proposition 5.3, the contribution of an open subchain of degree $d$ is $\Psi^{d} \Phi^{d-1}$, the four nontrivial subchains yield a contribution $\Psi^3 \Phi^2 \cdot \Psi^2 \Phi \cdot \Psi^3 \Phi^2 \cdot \Psi^2 \Phi = \Psi^{10} \Phi^{6}$. There are also two trivial subchains, thus resulting in a total contribution $\Psi^{12} \Phi^{6}$. Another way to think about such estimates is to count the additional factors of $\Psi$ and $\Phi$ gained along the proof. The naive size of the original graph, before linking, was $\Psi^{2 \deg(\Delta)} = \Psi^{10}$, since $Z$ in (7.2) contains five factors and we consider its second moment (i.e. set $p = 2$). Since $|F| = 1$, we gain an additional $\Psi^{2|F|} = \Psi^2$ from the linking; this step increases the number of edges from 10 to 12 in the graph on the left-hand side of Figure 7.2. Moreover, we gain an additional factor $\Phi$ from each internal summation vertex in the subchains; in this example we gain a factor $\Phi$ from each of the six vertices $a_2^1$, $a_3^1$, $a_4^1$, $a_2^2$, $a_3^2$, and $a_4^2$. Thus we recover the bound $\Psi^{12} \Phi^{6}$.
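The two ways of counting in this example can be checked mechanically. The following is a minimal sketch, not part of the original argument: it tracks a bound $\Psi^a \Phi^b$ as the exponent pair `(a, b)`, and the helper names `open_chain_bound` and `multiply` are hypothetical. The subchain degrees (3, 2, 3, 2 for the nontrivial subchains and 1, 1 for the trivial ones) are read off from the example above.

```python
# Power counting for the worked example: p = 2 copies of a chain of degree 5, |F| = 1.
# A bound Psi^a * Phi^b is represented by the exponent pair (a, b).

def open_chain_bound(degree):
    """Exponents of Psi^d * Phi^(d-1), the size of an open (sub)chain of degree d."""
    return (degree, degree - 1)

def multiply(bounds):
    """Multiply monomials in Psi and Phi by adding their exponents."""
    return (sum(a for a, _ in bounds), sum(b for _, b in bounds))

# First way of counting: multiply the bounds of all subchains.
nontrivial_degrees = [3, 2, 3, 2]   # the four nontrivial subchains
trivial_degrees = [1, 1]            # the two trivial subchains (single edges)

nontrivial = multiply([open_chain_bound(d) for d in nontrivial_degrees])
total = multiply([open_chain_bound(d) for d in nontrivial_degrees + trivial_degrees])

# Second way of counting: naive size, gain from linking, gain from chain vertices.
p, degree, size_F = 2, 5, 1
chain_vertices_per_copy = degree - 1            # c(Delta) for an open chain
naive_psi = p * degree                          # Psi^10 before linking
linking_gain = p * size_F                       # Psi^2 from the two linked vertices
phi_gain = p * (chain_vertices_per_copy - size_F)  # one Phi per unlinked chain vertex
alternative = (naive_psi + linking_gain, phi_gain)

print(nontrivial, total, alternative)
```

Both counts give the exponent pair for $\Psi^{12} \Phi^{6}$, in agreement with the text.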

Let us now give the general argument, which is in fact a trivial generalization of the above example. We start with a graph $\Gamma \in \mathfrak{G}^p_F(\Delta)$, as constructed in Section 6. We split the summation indices $\mathbf{a} = (\mathbf{a}', \mathbf{a}'')$, where $\mathbf{a}'$ consists of the chain vertices of $\Gamma$. Thus, $\mathbf{a}''$ contains in particular the indices associated with vertices which have been linked to an edge. By the argument of Section 6, $\mathbf{a}''$ contains all indices of $\mathbf{a}_F$, so that $|\mathbf{a}''| \geq |\mathbf{a}_F| = p|F|$. Since each linked vertex is incident to an additional edge resulting from linking, the graph $\Gamma$ contains at least $p \deg(\Delta) + |\mathbf{a}_F| \geq p(\deg(\Delta) + |F|)$ edges. So far we have simply repeated the argument of Section 6 and reproved the bound (6.1).



In order to gain an additional factor $\Phi$ from each of the summation indices in $\mathbf{a}'$, we use the induction assumption. The assumption is used on open chains of vertices, i.e. subgraphs of $\Gamma$ which are open chains. We fix $\mathbf{a}''$ and regard $\mathbf{a}'$ as the summation indices. Then $\Gamma$ becomes a collection of open (sub)chains, and the vertices associated with $\mathbf{a}'$ are the chain vertices of these subchains. If we can ensure that each subchain has degree strictly less than $\deg(\Delta)$, we can apply the induction assumption to get an additional factor $\Phi$ from each chain vertex represented in $\mathbf{a}'$. This will give us a bound of size

$$\Psi^{p \deg(\Delta) + |\mathbf{a}''|} \, \Phi^{|\mathbf{a}'|} \leq \Psi^{p(\deg(\Delta) + |F|)} \, \Phi^{p(c(\Delta) - |F|)},$$

where we used that $|\mathbf{a}''| \geq p|F|$, $|\mathbf{a}''| + |\mathbf{a}'| = p\, c(\Delta)$, and $\Psi \leq \Phi$.

In order to carry out this argument, we make the following observations. 

(i) All subchains of $\Gamma$ have degree strictly less than $\deg(\Delta)$. This property is crucial for the induction. It is a consequence of the two following facts. First, the linking of vertices never produces new subchains nor lengthens pre-existing subchains. Note that vertices in $\mathbf{a}''$ are fixed, and subchains terminate at them. Second, since $F \neq \emptyset$, at least one vertex of every subchain of degree $\deg(\Delta)$ in $\gamma^p(\Delta)$ will be linked to an edge, hence cutting the subchain of degree $\deg(\Delta)$ into smaller subchains.

(ii) The expression $Z'_{\mathbf{b}}$ encoded by any subchain $\Gamma'$ of $\Gamma$ always appears in conjunction with a chain weight $w'(\mathbf{b})$. This is an immediate consequence of the fact that the weight $w(\mathbf{a}^1) \cdots w(\mathbf{a}^p)$ is a chain weight by assumption.

(iii) Let $Z'_{\mathbf{b}}$ denote the monomial encoded by a subchain $\Gamma'$ of $\Gamma$. Then any $Q_a$ has an index $a$ in $\mathbf{a}''$ (i.e. is fixed), and acts either on all resolvent entries of $Z'_{\mathbf{b}}$ or none of them.

In order to invoke the induction assumption, we still have to get rid of the upper indices in the maximally expanded resolvent entries. The procedure is almost identical to the one following Definition 6.4, but in the opposite direction. In particular, the key formula (3.13) should be viewed in the form

$$G^{(Tk)}_{ij} = G^{(T)}_{ij} - \frac{G^{(T)}_{ik} G^{(T)}_{kj}}{G^{(T)}_{kk}}, \qquad \frac{1}{G^{(Tk)}_{ii}} = \frac{1}{G^{(T)}_{ii}} + \frac{G^{(T)}_{ik} G^{(T)}_{ki}}{G^{(T)}_{ii} \, G^{(Tk)}_{ii} \, G^{(T)}_{kk}}. \qquad (7.3)$$

We start removing the upper indices one by one using (7.3), and stop if either all upper indices have been removed or if the number of off-diagonal resolvent entries exceeds $\deg(\Delta) + 2p\ell$. The size of the latter terms is already sufficiently small by the trivial bound $\Lambda \prec \Psi$. As for the former terms, they are represented by a new (but still finite) set of graphs in which every vertex is either a chain vertex or has been linked with an edge.
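The identities used to remove upper indices are purely algebraic (Schur complement formulas), so they can be sanity-checked numerically on a small matrix. The sketch below is an illustration under stated assumptions, not part of the proof: it uses a hypothetical 6 x 6 real symmetric Gaussian matrix, the spectral parameter `z = 0.3 + 0.5j`, the removed set `T = (0,)` and the index `k = 1`; the helpers `inverse` and `resolvent` are written for this example only. It compares both sides of the off-diagonal and the diagonal form of (7.3).

```python
import random

def inverse(matrix):
    """Invert a small complex matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(h, z, removed=()):
    """Entries of (H^(T) - z)^(-1), where H^(T) is the minor with indices in `removed` deleted.

    The result is a dict keyed by the original index labels (i, j).
    """
    keep = [i for i in range(len(h)) if i not in removed]
    minor = [[h[i][j] - (z if i == j else 0) for j in keep] for i in keep]
    inv = inverse(minor)
    return {(i, j): inv[a][b] for a, i in enumerate(keep) for b, j in enumerate(keep)}

# Hypothetical test matrix: real symmetric with centred Gaussian entries.
random.seed(0)
n = 6
h = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        h[i][j] = h[j][i] = random.gauss(0.0, 1.0) / n ** 0.5

z = 0.3 + 0.5j
T, k = (0,), 1
g_T = resolvent(h, z, removed=T)            # G^(T)
g_Tk = resolvent(h, z, removed=T + (k,))    # G^(Tk)
rest = [i for i in range(n) if i not in T + (k,)]

# Off-diagonal form: G^(Tk)_ij = G^(T)_ij - G^(T)_ik G^(T)_kj / G^(T)_kk.
offdiag_error = max(
    abs(g_Tk[i, j] - (g_T[i, j] - g_T[i, k] * g_T[k, j] / g_T[k, k]))
    for i in rest for j in rest
)

# Diagonal form: 1/G^(Tk)_ii = 1/G^(T)_ii + G^(T)_ik G^(T)_ki / (G^(T)_ii G^(Tk)_ii G^(T)_kk).
diag_error = max(
    abs(1 / g_Tk[i, i]
        - (1 / g_T[i, i] + g_T[i, k] * g_T[k, i] / (g_T[i, i] * g_Tk[i, i] * g_T[k, k])))
    for i in rest
)

print(offdiag_error, diag_error)
```

Both errors should be at the level of floating-point rounding, since the identities hold exactly for any matrix for which the relevant resolvents exist.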

Now the induction assumption is applicable to each subchain, and the proof is completed by invoking Lemma 3.6. (Note that as before we ignored issues related to coinciding indices according to Simplification (SI); these are dealt with using the inclusion-exclusion argument of Lemma 9.5.) □
7.2. Step I2: chains with $F = \emptyset$. Step I2 is completed in the following proposition.

Proposition 7.2 (Induction Step I2). Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$, and let $\ell \geq 2$. Suppose that

$$X^w_F(\Delta) \prec \Psi^{\deg(\Delta) + |F|} \, \Phi^{c(\Delta) - |F|} \qquad (7.4)$$

holds for any open chain $\Delta$ of degree $\ell$, any $F \neq \emptyset$, and any adapted chain weight $w$. If $\ell \geq 3$, suppose in addition that (7.4) holds for any open chain $\Delta$ of degree strictly less than $\ell$, $F = \emptyset$, and any adapted chain weight $w$.

Then (7.4) holds for any open chain $\Delta$ of degree $\ell$, $F = \emptyset$, and any adapted chain weight $w$.



Proof. As before, we focus on the case where $\Delta$ is a directed open chain; the proof in the undirected case is the same. Thus,

$$Z_{a_1 \cdots a_n} = G_{\mu_1 a_1} G_{a_1 a_2} \cdots G_{a_n \mu_2}, \qquad w_{\mathbf{b}}(\mathbf{a}) = w(\mathbf{a}) = s_{a_1 b_1} \cdots s_{a_n b_n}$$

for some $\mathbf{b} = (b_1, \dots, b_n)$. (Recall that $G_{ij} = \mathcal{G}_{ij}$ for $i \neq j$.) Note that $n = \deg(\Delta) - 1$. We have to prove that

$$X^w_\emptyset(\Delta) \prec \Psi^{n+1} \Phi^{n}. \qquad (7.5)$$

(Here we used that $c(\Delta) = n$.) The main idea of the proof was given in Section 3.2.3: derive a stable self-consistent equation whose error terms may be estimated using the induction assumption. We subdivide the proof into six steps.

To simplify notation, throughout this proof we use $\mathcal{E} = \mathcal{E}(\mathbf{b}, \mu_1, \mu_2)$ to denote a random error term satisfying $\mathcal{E} \prec \Psi^{n+2} \Phi^{n-1}$. Like generic constants $C$, these error terms may change from line to line without changing name.

Moreover, in order to keep the presentation more concise, we shall sometimes ignore unimportant subtleties arising from coinciding summation indices. These complications are harmless and will be dealt with precisely using the inclusion-exclusion argument of Lemma 9.5. The general philosophy is the following: if we constrain a pair of indices to coincide instead of being distinct, we lose at most two factors of $\Psi$. Indeed, we lose at most one chain vertex (resulting in a loss of $\Phi \geq \Psi$), and at most one off-diagonal entry of $G$ may become diagonal (resulting in a loss of $\Psi$). Note that, since $\Delta$ is an open chain, at most one off-diagonal entry may become diagonal when setting two summation indices to be equal. (This is not true for closed chains; see Section 7.3.) This loss of $\Psi^2$ is compensated by the factor $M^{-1} \leq \Psi^2$ we gain from the reduction in the number of summation variables.

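Step (i). We introduce a factor $P_{a_1}$ into the summation in $X^w_\emptyset(\Delta)$. We find

$$X^w_\emptyset(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} Z_{a_1 \cdots a_n} + X^w_{\{1\}}(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} Z_{a_1 \cdots a_n} + \mathcal{E},$$

where the second equality follows from the induction assumption.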

Step (ii). We introduce a factor $m^2 / (G_{a_1 a_1})^2$ in the summand of $X^w_\emptyset(\Delta)$; this prefactor will be important in the fourth step below, as the factor $1/(G_{a_1 a_1})^2$ will be used to cancel diagonal resolvent entries arising from two applications of the identity (3.14a). We find

$$\sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} Z_{a_1 \cdots a_n} = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{(G_{a_1 a_1} - m)^2 + 2m (G_{a_1 a_1} - m) + m^2}{(G_{a_1 a_1})^2} \, Z_{a_1 \cdots a_n} \right]$$

$$= \sum_{a_1}^{(\mu_1 \mu_2)} s_{a_1 b_1} P_{a_1} \left[ \frac{(G_{a_1 a_1} - m)^2 + 2m (G_{a_1 a_1} - m) + m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} \times \sum_{a_2, \dots, a_n}^{(a_1 \mu_1 \mu_2)*} s_{a_2 b_2} \cdots s_{a_n b_n} \, G_{a_1 a_2} \cdots G_{a_n \mu_2} \right].$$

For $n = 1$ (i.e. $\deg(\Delta) = 2$) the last line is understood to be $G_{a_1 \mu_2}$.

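We now show that only the term $m^2$ in the numerator is relevant. By the induction assumption and Lemma 3.6 we find that

$$\sum_{a_2, \dots, a_n}^{(a_1 \mu_1 \mu_2)*} s_{a_2 b_2} \cdots s_{a_n b_n} \, G_{a_1 a_2} \cdots G_{a_n \mu_2} \prec \Psi^{n} \Phi^{n-1}. \qquad (7.6)$$

(Note that the induction assumption is only used if $n \geq 2$; for the initial value $n = 1$, (7.6) is trivial.) Using $\sum_{a_1} s_{a_1 b_1} \leq 1$, (3.17), and Lemma 3.6 again, we get

$$X^w_\emptyset(\Delta) = \widetilde{X}^w(\Delta) + \mathcal{E}, \qquad \widetilde{X}^w(\Delta) := \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, Z_{a_1 \cdots a_n} \right]. \qquad (7.7)$$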



Step (iii). We make all resolvent entries which do not contain the index $a_1$ independent of $a_1$ using (3.13). Thus, we assume that $n \geq 2$; if $n = 1$ there is nothing to be done and this step is trivial. Using (3.13) we find

$$\widetilde{X}^w(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} G_{a_2 a_3} G_{a_3 a_4} \cdots G_{a_n \mu_2} \right]$$

$$= \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} \left( G^{(a_1)}_{a_2 a_3} + \frac{G_{a_2 a_1} G_{a_1 a_3}}{G_{a_1 a_1}} \right) G_{a_3 a_4} \cdots G_{a_n \mu_2} \right]$$

$$= \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} G^{(a_1)}_{a_2 a_3} G_{a_3 a_4} \cdots G_{a_n \mu_2} \right] + \mathcal{E}.$$

Here the bound on the error term

$$\sum_{a_1}^{(\mu_1 \mu_2)} s_{a_1 b_1} P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^3} \, G_{\mu_1 a_1} \sum_{a_2, \dots, a_n}^{(a_1 \mu_1 \mu_2)*} s_{a_2 b_2} \cdots s_{a_n b_n} \, G_{a_1 a_2} G_{a_2 a_1} G_{a_1 a_3} G_{a_3 a_4} \cdots G_{a_n \mu_2} \right]$$

follows by first fixing the summation index $a_1$ and using the induction assumption combined with Lemma 3.6, similarly to Step (ii) above. The induction assumption is used on two chains: one of degree 2 (corresponding to $G_{a_1 a_2} G_{a_2 a_1}$) and one of degree $n - 1$ (corresponding to $G_{a_1 a_3} G_{a_3 a_4} \cdots G_{a_n \mu_2}$); here $a_1$ is regarded as an external index. (Here we swept under the rug a minor technicality. Strictly speaking, the expressions encoded by different subchains do not factor, since their summations are still coupled by the constraint $a_2 \notin \{a_1, a_3, a_4, \dots, a_n\}$. As outlined above, we ignore such complications here; they are dealt with using the inclusion-exclusion argument from Lemma 9.5 in Section 9.3 by introducing a partitioning on the values of $a_2$, which results in a decoupled expression plus a series of small error terms.)
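Next, we write

$$\sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} G^{(a_1)}_{a_2 a_3} G_{a_3 a_4} G_{a_4 a_5} \cdots G_{a_n \mu_2} \right]$$

$$= \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} G^{(a_1)}_{a_2 a_3} G^{(a_1)}_{a_3 a_4} G_{a_4 a_5} \cdots G_{a_n \mu_2} \right] + \mathcal{R}, \qquad (7.8)$$

where the error term is

$$\mathcal{R} := \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^3} \, G_{\mu_1 a_1} G_{a_1 a_2} \left( G_{a_2 a_3} - \frac{G_{a_2 a_1} G_{a_1 a_3}}{G_{a_1 a_1}} \right) G_{a_3 a_1} G_{a_1 a_4} \, G_{a_4 a_5} \cdots G_{a_n \mu_2} \right].$$

We may estimate this exactly as above, by first fixing $a_1$ and regarding it as an external index. The induction assumption allows us to estimate the two resulting terms

$$\sum_{a_2, \dots, a_n}^{(a_1 \mu_1 \mu_2)*} s_{a_2 b_2} \cdots s_{a_n b_n} \, G_{a_1 a_2} G_{a_2 a_3} G_{a_3 a_1} G_{a_1 a_4} G_{a_4 a_5} \cdots G_{a_n \mu_2}$$

(a product of two subchains) and

$$\sum_{a_2, \dots, a_n}^{(a_1 \mu_1 \mu_2)*} s_{a_2 b_2} \cdots s_{a_n b_n} \, G_{a_1 a_2} G_{a_2 a_1} G_{a_1 a_3} G_{a_3 a_1} G_{a_1 a_4} G_{a_4 a_5} \cdots G_{a_n \mu_2}$$

(a product of three subchains). Note that the induction assumption is always used on subchains of degree strictly less than $\deg(\Delta) = n + 1$. Using Lemma 3.6, one therefore finds that $\mathcal{R}$ in (7.8) can be replaced with an $\mathcal{E}$. Continuing in this manner, we eventually get

$$\widetilde{X}^w(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ \frac{m^2}{(G_{a_1 a_1})^2} \, G_{\mu_1 a_1} G_{a_1 a_2} G^{(a_1)}_{a_2 a_3} G^{(a_1)}_{a_3 a_4} \cdots G^{(a_1)}_{a_n \mu_2} \right] + \mathcal{E}. \qquad (7.9)$$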



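Step (iv). We apply the identity (3.14a) to both resolvent entries with lower index $a_1$. This yields

$$\widetilde{X}^w(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w(\mathbf{a}) \, P_{a_1} \left[ m^2 \sum_{d, d'}^{(a_1)} G^{(a_1)}_{\mu_1 d} \, h_{d a_1} h_{a_1 d'} \, G^{(a_1)}_{d' a_2} G^{(a_1)}_{a_2 a_3} G^{(a_1)}_{a_3 a_4} \cdots G^{(a_1)}_{a_n \mu_2} \right] + \mathcal{E}$$

$$= m^2 \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} \sum_{d}^{(a_1)} w(\mathbf{a}) \, s_{a_1 d} \, G^{(a_1)}_{\mu_1 d} G^{(a_1)}_{d a_2} \cdots G^{(a_1)}_{a_n \mu_2} + \mathcal{E}, \qquad (7.10)$$

where in the second step we used that $P_{a_1}(h_{d a_1} h_{a_1 d'}) = \delta_{d d'} s_{a_1 d}$, and that all resolvent entries are independent of $a_1$. Renaming $(a_1, d) \mapsto (d, a_1)$ and interchanging the order of summation, we find from (7.7) and (7.10)

$$X^w_\emptyset(\Delta) = m^2 \sum_{d}^{(\mu_1 \mu_2)} \sum_{a_1}^{(d)} \sum_{a_2, \dots, a_n}^{(d \mu_1 \mu_2)*} s_{b_1 d} \, s_{d a_1} s_{b_2 a_2} \cdots s_{b_n a_n} \, G^{(d)}_{\mu_1 a_1} G^{(d)}_{a_1 a_2} \cdots G^{(d)}_{a_n \mu_2} + \mathcal{E}. \qquad (7.11)$$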



Step (v). Having performed the expectation, we now get rid of the upper indices $d$ in all of the resolvent entries of (7.11) to go back to the original resolvent entries. To that end, we write

$$G^{(d)}_{\mu_1 a_1} G^{(d)}_{a_1 a_2} \cdots G^{(d)}_{a_n \mu_2} = \left( G_{\mu_1 a_1} - \frac{G_{\mu_1 d} G_{d a_1}}{G_{dd}} \right) \left( G_{a_1 a_2} - \frac{G_{a_1 d} G_{d a_2}}{G_{dd}} \right) \cdots \left( G_{a_n \mu_2} - \frac{G_{a_n d} G_{d \mu_2}}{G_{dd}} \right) \qquad (7.12)$$

in the summand of (7.11), and multiply everything out. As before, each term results in a collection of subchains of degree strictly less than $\deg(\Delta) = n + 1$, to which the induction assumption may be applied. More precisely, suppose that we have chosen the second term in $k \leq n + 1$ of the factors in (7.12). Then we get $k$ additional resolvent entries of $G$ (yielding a total of $n + k + 1$), as well as a collection of subchains whose total number of chain vertices is $n - k$. Notice that the index structure of every term after multiplying (7.12) out is chain-like. The result is

$$X^w_\emptyset(\Delta) = m^2 \sum_{d}^{(\mu_1 \mu_2)} \sum_{a_1, \dots, a_n}^{(d \mu_1 \mu_2)*} s_{b_1 d} \, s_{d a_1} s_{b_2 a_2} \cdots s_{b_n a_n} \, G_{\mu_1 a_1} G_{a_1 a_2} \cdots G_{a_n \mu_2} + \mathcal{E}. \qquad (7.13)$$

(As before, we ignore the issues related to the cases $a_1 \in \{\mu_1, \mu_2, a_2, \dots, a_n\}$; see the inclusion-exclusion argument of Lemma 9.5.) We have the rough bound

$$\sum_{a_1, \dots, a_n}^{(d \mu_1 \mu_2)*} s_{d a_1} s_{b_2 a_2} \cdots s_{b_n a_n} \, G_{\mu_1 a_1} G_{a_1 a_2} \cdots G_{a_n \mu_2} \prec \Psi^{n+1} \Phi^{n-1}. \qquad (7.14)$$

Indeed, by fixing $a_1$ and using the induction assumption on the subchain of degree $n$ that encodes the expression $G_{a_1 a_2} \cdots G_{a_n \mu_2}$, (7.14) follows by Lemma 3.6. It follows using $s_{b_1 d} \leq M^{-1}$ that we may replace the summation $\sum_d^{(\mu_1 \mu_2)}$ in (7.13) by $\sum_d$ up to an error of type $\mathcal{E}$. Finally, we may replace the sum $\sum_{a_1, \dots, a_n}^{(d \mu_1 \mu_2)*}$ in (7.13) by $\sum_{\mathbf{a}}^{(\mu_1 \mu_2)*}$ up to an error of type $\mathcal{E}$, by the inclusion-exclusion argument of Lemma 9.5. The result is

$$X^w_\emptyset(\Delta) = m^2 \sum_{d} \sum_{a_1, \dots, a_n}^{(\mu_1 \mu_2)*} s_{b_1 d} \, s_{d a_1} s_{b_2 a_2} \cdots s_{b_n a_n} \, G_{\mu_1 a_1} G_{a_1 a_2} \cdots G_{a_n \mu_2} + \mathcal{E}. \qquad (7.15)$$

Step (vi). We fix $b_2, \dots, b_n$ and regard $b_1$ as a free index. Define

$$v_{b_1} := X^{w_{\mathbf{b}}}_\emptyset(\Delta) = \sum_{\mathbf{a}}^{(\mu_1 \mu_2)*} w_{b_1 b_2 \cdots b_n}(\mathbf{a}) \, Z_{a_1 \cdots a_n}.$$

Then (7.15) reads

$$v_{b_1} = (m^2 S v)_{b_1} + \mathcal{E}_{b_1}, \qquad (7.16)$$

where $\mathcal{E}_{b_1} \prec \Psi^{n+2} \Phi^{n-1}$. (Here we use the notation $\mathcal{E}_{b_1}(b_2, \dots, b_n, \mu_1, \mu_2) = \mathcal{E}(b_1, \dots, b_n, \mu_1, \mu_2)$, indicating that $b_1$ is the variable index and all other indices are fixed for this argument.) Inverting the self-consistent equation yields

$$v_{b_1} = \big( (1 - m^2 S)^{-1} \mathcal{E} \big)_{b_1}.$$

In order to complete the proof, we observe that if $X_j^k \prec \Psi$ uniformly in $(j, k)$, then for any matrix $A = (A_{ij})$ we have $\sum_j A_{ij} X_j^k \prec \|A\|_{\ell^\infty \to \ell^\infty} \Psi$ uniformly in $(i, k)$. Recalling the definition (3.2), we therefore get

$$X^w_\emptyset(\Delta) = \big( (1 - m^2 S)^{-1} \mathcal{E} \big)_{b_1} \prec \varrho \, \Psi^{n+2} \Phi^{n-1} \leq \Psi^{n+1} \Phi^{n}.$$

The last inequality is valid only if $\Phi < 1$, but the final bound is still correct even if $\Phi = 1$ by using the trivial bound $X^w_\emptyset(\Delta) \prec \Psi^{n+1}$. This concludes the proof. □
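The last step of the proof, inverting the self-consistent equation (7.16) and controlling the result in the supremum norm, can be illustrated on a toy example. The following sketch makes several assumptions that are not taken from the paper: a hypothetical 8 x 8 circulant band profile for $S$ (each row sums to one), a stand-in value `m = 0.6 + 0.5j` with $|m| < 1$, and an arbitrary small error vector. Under these assumptions the Neumann series for $(1 - m^2 S)^{-1}$ converges and its $\ell^\infty \to \ell^\infty$ norm is at most $1/(1 - |m|^2)$; in the setting of the paper this norm is the quantity $\varrho$ and need not satisfy such a simple bound.

```python
def solve_self_consistent(s, m, error, iterations=200):
    """Solve v = m^2 S v + error via the Neumann series sum_k (m^2 S)^k error.

    Assumes |m|^2 * (maximal row sum of S) < 1, so that the series converges.
    """
    n = len(error)
    v = list(error)
    term = list(error)
    for _ in range(iterations):
        term = [m * m * sum(s[i][j] * term[j] for j in range(n)) for i in range(n)]
        v = [a + b for a, b in zip(v, term)]
    return v

# Hypothetical symmetric band profile: s_ij = 1/3 if i and j are cyclic neighbours or equal.
n = 8
s = [[1.0 / 3 if min((j - i) % n, (i - j) % n) <= 1 else 0.0 for j in range(n)]
     for i in range(n)]

m = 0.6 + 0.5j                       # stand-in with |m|^2 = 0.61 < 1 (assumption)
error = [complex((-1) ** i * 1e-3, 1e-3 * i / n) for i in range(n)]

v = solve_self_consistent(s, m, error)

# Residual of the self-consistent equation v = m^2 S v + error.
residual = max(
    abs(v[i] - (m * m * sum(s[i][j] * v[j] for j in range(n)) + error[i]))
    for i in range(n)
)

# Supremum-norm bound: ||v|| <= ||(1 - m^2 S)^{-1}|| * ||error|| <= ||error|| / (1 - |m|^2).
sup_error = max(abs(e) for e in error)
sup_v = max(abs(x) for x in v)
rho_bound = 1.0 / (1.0 - abs(m) ** 2)

print(residual, sup_v, rho_bound * sup_error)
```

The point of the sketch is only the mechanism: an entrywise bound on the error term is transferred to the solution at the cost of the operator norm of $(1 - m^2 S)^{-1}$.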

7.3. Completion of the induction and the proof of Proposition 5.3. We may now complete the proof of Proposition 5.3. We begin with open chains. As outlined in Section 5, the proof is by induction on $\deg(\Delta)$. The induction is started with the trivial open chain $\Delta$ corresponding to $Z = G_{\mu\nu}$, for which we have the trivial bound $G_{\mu\nu} \prec \Psi$. Then (5.2) for an arbitrary open chain $\Delta$ follows by induction, using Propositions 7.1 and 7.2.



In order to prove (5.2) for an arbitrary closed chain $\Delta$, we follow almost to the letter the arguments from Sections 7.1 and 7.2. The proof consists of two steps, each repeated twice.

(a) Prove (5.2) for $F \neq \emptyset$.

(b) Prove (5.2) for $F = \emptyset$.

The order of the argument is as follows. First we do step (a) for closed chains with one external vertex, then step (b) for closed chains with one external vertex, then step (a) again but now for closed chains with no external vertex, and finally step (b) for closed chains with no external vertex. Here no induction is required; the necessary input is (5.2) for arbitrary open chains. Each one of the four above steps uses the previous ones as input. The proof of either step (a) is almost identical to that of Proposition 7.1, and the proof of either step (b) is almost identical to that of Proposition 7.2. The only nontrivial difference is associated with coinciding indices, where we may lose a factor $\Psi^2 \Phi^2$ if two indices coincide. Here the worst case is the closed chain of degree two with no external vertices: $\sum_{a,b} s_{\mu a} s_{\nu b} G_{ab} G_{ba}$. The associated monomial $G_{ab} G_{ba}$ is of degree two and has two chain vertices. However, setting $a = b$ yields a contribution of order $M^{-1}$, i.e. we lost a factor $\Psi^2$ (from the two off-diagonal entries that became diagonal) and $\Phi^2$ (from the two chain vertices). However, this loss is compensated by the gain $M^{-1}$: we get the bound $\Psi^2 \Phi^2 + M^{-1} \leq 2 \Psi^2 \Phi^2$.

Let us sketch the general cases. Consider a closed chain $\Delta$ with no external vertices. Thus, $\Delta$ has $c(\Delta) = \deg(\Delta)$ chain vertices. If we ignore coinciding indices, we get the bound $\Psi^{\deg(\Delta)} \Phi^{\deg(\Delta)}$ on its size. On the other hand, if all of the $\deg(\Delta)$ indices coincide, we get the bound $M^{-\deg(\Delta) + 1}$. Indeed, all but one of the entries of $S$ in the chain weight can be estimated using their maximum $M^{-1}$; moreover, all resolvent entries are diagonal and hence of size 1. This yields the combined bound

$$\Psi^{\deg(\Delta)} \Phi^{\deg(\Delta)} + M^{-\deg(\Delta) + 1} \leq 2 \, \Psi^{\deg(\Delta)} \Phi^{\deg(\Delta)}, \qquad (7.17)$$

where we used that $\deg(\Delta)/2 \leq \deg(\Delta) - 1$. This is (5.2).

The case of a closed chain $\Delta$ with one external vertex is similar. In this case we have $c(\Delta) = \deg(\Delta) - 1$. Ignoring coinciding indices, we get the bound $\Psi^{\deg(\Delta)} \Phi^{\deg(\Delta) - 1}$. On the other hand, if all indices coincide we get the bound $M^{-\deg(\Delta) + 1}$ exactly as before. This yields the combined bound



$$\Psi^{\deg(\Delta)} \Phi^{\deg(\Delta) - 1} + M^{-\deg(\Delta) + 1} \leq 2 \, \Psi^{\deg(\Delta)} \Phi^{\deg(\Delta) - 1}, \qquad (7.18)$$

where we used (2.9). This is (5.2).



Note that in the bounds (7.17) and (7.18) we only considered the two extreme cases: when all summation indices are distinct, and when they all coincide. In Lemma 9.4 we prove that these bounds in fact cover all possible index configurations. The full details on coinciding indices are given in Section 9.3. This concludes the proof of (5.2), and hence of Proposition 5.3.



8. General monomials and vertex resolution 



In this section we conclude the proof of Theorem 4.8 for general $\Delta$ under certain simplifying assumptions. A sketch of the argument presented in this section was given in Section 3.2.4; the key new concept is that of vertex resolution, which relies on Family B identities.

Our starting point is the graphical expansion from Section 6 as well as the chain estimates from Proposition 5.3. Throughout this section we assume Simplification (SI) from Section 6. Moreover, we shall tacitly make use of Lemma 3.6 as well as the notations and definitions from Section 6.

In order to perform the vertex resolution, it will prove necessary to expand all resolvent entries in all of the summation indices $\mathbf{a}$ instead of the smaller set $\mathbf{a}_F$ (as was done in Section 6). Thus, as a first step, we repeat the construction of Section 6: we start with the graph $\gamma^p(\Delta)$ that encodes the $p$-th moment of $X^w_F(\Delta)$, and perform the expansion given after Definition 6.4, except we now expand in the full set $\mathbf{a}$ of summation indices instead of $\mathbf{a}_F$. This gives rise to a family of graphs which we denote by $\widetilde{\mathfrak{G}}^p_F(\Delta)$. Note that $\widetilde{\mathfrak{G}}^p_F(\Delta) \supset \mathfrak{G}^p_F(\Delta)$, where, we recall, the set $\mathfrak{G}^p_F(\Delta)$ is the set generated in Section 6 by expanding in the indices $\mathbf{a}_F$ only. Thus, each graph $\Gamma \in \widetilde{\mathfrak{G}}^p_F(\Delta)$ encodes a monomial of entries of $G$, the edge $(x, y)$ giving rise to the maximally expanded entry $\mathcal{G}^{(\mathbf{a} \setminus \{x, y\})}_{xy}$ or $\mathcal{G}^{(\mathbf{a} \setminus \{x, y\})*}_{xy}$, depending on its colour. Here, and throughout the following, we use the phrase maximally expanded to mean maximally expanded in $\mathbf{a}$ (see Definition 6.4). As in Section 6, we do not keep track of the $Q$'s in our notation. (Indeed, this information will turn out to be unimportant for our proof.)
Note that Definition 6.6 carries over verbatim for $\Gamma \in \widetilde{\mathfrak{G}}^p_F(\Delta)$. For the following we pick and fix a $\Gamma \in \widetilde{\mathfrak{G}}^p_F(\Delta)$. Thus, $\Gamma$ encodes a monomial, whereby each edge of $\Gamma$ gives rise to a maximally expanded entry of $\mathcal{G}$. As explained in Section 6, the linking procedure used to make all entries maximally expanded ensures, thanks to the presence of the $Q$'s, that $|E(\Gamma)| \geq p(\deg(\Delta) + |F|)$.

The main idea of vertex resolution already appeared in Section 3.2.4. Roughly, we resolve each summation vertex using the Family B identities (3.14a) and (3.14b), which results in a new set of summation vertices which we call fresh and draw using white dots. We call the resulting graph $\Theta$. (More precisely, from each graph $\Gamma$ we get a finite family of new graphs $\{\Theta_\alpha\}$.) Next, we take the expectation, which results in a summation over all pairings (in fact, more generally, over all lumpings) of the white vertices adjacent to the original summation vertex. Each pairing gives rise to a new graph, which we call $\Upsilon$. (As above, from each graph $\Theta$ we get a finite family of new graphs $\{\Upsilon_\alpha\}$.) Although each step results in an increase in the number of graphs, it is easy to see that this combinatorial factor is bounded by a constant depending only on $|V(\Gamma)|$, the number of vertices in $\Gamma$. In other words, the above families $\{\Theta_\alpha\}$ and $\{\Upsilon_\alpha\}$ are finite and do not depend on $N$.

The step $\Gamma \mapsto \Theta$ is performed in Section 8.2 and the step $\Theta \mapsto \Upsilon$ in Section 8.3. Figure 8.1 contains a summary of this process, on the level of a single vertex, which is helpful to keep in mind while reading the following. The idea is that, provided the vertex being resolved arose as a copy of a charged vertex (see Definition 4.7), the resolution process will (in leading order) generate at least one fresh chain vertex. From this chain vertex we shall gain a factor $\Phi$ by invoking Proposition 5.3, and this will conclude the proof.



Figure 8.1. A graphical overview of vertex resolution on the level of a single vertex. (We do not draw the other 
vertices.) In the second step we draw only one of the possible six pairings (in the Hermitian case). 



Note that the notion of charged vertex can be lifted from $\Delta$ to $\Gamma$ using the projection $\pi$ (see Definition 8.1 below). The following class of vertices, called marked vertices, plays the central role in this section. Informally, a marked vertex $i \in V_s(\Gamma)$ is a charged vertex that, in the construction of $\Gamma$ from $\gamma^p(\Delta)$, was linked to by the smallest allowed number of edges (zero if $\pi(i) \notin F$ and one if $\pi(i) \in F$).

Recall that we intend to gain an additional factor $\Phi$ from any charged vertex of $\Gamma$. However, the possibility of doing this may be destroyed if a charged vertex was linked to in the construction of $\Gamma$ from $\gamma^p(\Delta)$. In this case we gain from the linking, as always, but we may not additionally gain from the vertex's being charged.
If the vertex was excessively linked (i.e. more than minimally required), then we have the additional gain 
from this extra linking. Marked vertices are exactly those charged vertices which have been minimally linked. 
Thus, in order to gain from a marked vertex we cannot use the simple power counting that underlies linking, 
but need the more refined mechanism of vertex resolution. 

Definition 8.1 (Charged and marked vertices in $\Gamma$). The set of charged vertices of $\Gamma$ is by definition $V_c(\Gamma) := \pi^{-1}(V_c(\Delta))$ (see Definition 4.7).

The vertex $i \in V_c(\Gamma)$ is called marked if either

(i) $\pi(i) \notin F$ and $\deg_\Gamma(i) = \deg_\Delta(\pi(i))$, or

(ii) $\pi(i) \in F$ and $\deg_\Gamma(i) = \deg_\Delta(\pi(i)) + 2$.

We denote the set of marked vertices by $V_m(\Gamma) \subset V_c(\Gamma)$.

Note that (i) corresponds to the case where $i$ was not linked to at all, and (ii) to the case where $i$ was linked to exactly once. The following lemma gives a lower bound on the number of edges of $\Gamma$. Informally, it states that if $i$ is not marked but $\pi(i)$ is charged then $i$ was linked to at least once more than the minimum required amount (zero if $\pi(i) \notin F$ and one if $\pi(i) \in F$).

Lemma 8.2. We have the bound

$$|E(\Gamma)| \geq p(\deg(\Delta) + |F|) + |V_c(\Gamma)| - |V_m(\Gamma)|. \qquad (8.1)$$

Proof. By definition of $V_c(\Gamma)$ and $V_m(\Gamma)$, we find that $i \in V_s(\Gamma)$ was linked to at least once if

$$i \in \pi^{-1}(F^c) \cap V_c(\Gamma) \cap V_m(\Gamma)^c \quad \text{or} \quad i \in \pi^{-1}(F) \cap \big( V_c(\Gamma) \cap V_m(\Gamma)^c \big)^c,$$

and $i$ was linked to at least twice if

$$i \in \pi^{-1}(F) \cap V_c(\Gamma) \cap V_m(\Gamma)^c.$$

Since each linking adds an edge to the $p \deg(\Delta)$ edges of $\gamma^p(\Delta)$, we find

$$|E(\Gamma)| \geq p \deg(\Delta) + \big| \pi^{-1}(F^c) \cap V_c(\Gamma) \cap V_m(\Gamma)^c \big| + \big| \pi^{-1}(F) \cap \big( V_c(\Gamma) \cap V_m(\Gamma)^c \big)^c \big| + 2 \big| \pi^{-1}(F) \cap V_c(\Gamma) \cap V_m(\Gamma)^c \big|$$

$$= p \deg(\Delta) + p|F| + p|V_c(\Delta)| - |V_m(\Gamma)|,$$

where in the last step we used that $\pi$ is $p$-to-one and $V_m(\Gamma) \subset V_c(\Gamma)$. □



The goal of this section is to gain an extra factor $\Phi$ from each marked vertex of $\Gamma$ using vertex resolution. Provided we can do this, the proof of Theorem 4.8 will be complete. This can be informally understood as follows. In order to get the estimate (4.7), we need a bound of size $\Psi^{p(\deg(\Delta) + |F|)} \Phi^{p|V_c(\Delta)|}$. Each vertex $i \in V_s(\Gamma)$ contributes factors $\Psi$ and $\Phi$ (in addition to the trivial $\Psi^{p \deg(\Delta)}$) to the estimate as follows.

(i) $\pi(i) \notin F$ and $i \notin V_c(\Gamma)$. In this case $i$ yields no factor $\Psi$ or $\Phi$.

(ii) $\pi(i) \notin F$ and $i \in V_c(\Gamma)$. If $\deg_\Gamma(i) = \deg_\Delta(\pi(i))$ then $i$ is marked and will yield a factor $\Phi$ by vertex resolution. If $\deg_\Gamma(i) > \deg_\Delta(\pi(i))$ then $i$ is not marked but carries an extra factor $\Psi$ since it has been linked to more times than needed. (Thus, $\Gamma$ has at least one extra edge, corresponding to an off-diagonal entry $G_{uv} \prec \Psi$, incident to $i$.)

(iii) $\pi(i) \in F$ and $i \notin V_c(\Gamma)$. In this case $i$ has been linked to at least once and is consequently incident to at least one extra edge. This yields a factor $\Psi$.

(iv) $\pi(i) \in F$ and $i \in V_c(\Gamma)$. As in (iii), $i$ has been linked to at least once and hence yields a factor $\Psi$. In addition, $i$ yields a second factor $\Phi$ as follows. If $\deg_\Gamma(i) = \deg_\Delta(\pi(i)) + 2$ then $i$ is marked and will yield an extra factor $\Phi$ by vertex resolution. If $\deg_\Gamma(i) > \deg_\Delta(\pi(i)) + 2$ then $i$ has been linked to at least twice, hence yielding a second factor $\Psi$. In either case the vertex $i$ generates a factor $\Psi \Phi$.

Thus, from each case (i) - (iv) we gain $\ell_\Psi$ factors $\Psi$ and $\ell_\Phi$ factors $\Phi$ in addition to the trivially available $p \deg(\Delta)$ factors of $\Psi$, where the values of $\ell_\Psi$ and $\ell_\Phi$ are as follows: (i) $\ell_\Psi = \ell_\Phi = 0$, (ii) $\ell_\Psi = 0$, $\ell_\Phi = 1$, (iii) $\ell_\Psi = 1$, $\ell_\Phi = 0$, (iv) $\ell_\Psi = \ell_\Phi = 1$. From this the bound $\Psi^{p(\deg(\Delta) + |F|)} \Phi^{p|V_c(\Delta)|}$ follows immediately.
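The case distinction (i) - (iv) amounts to a small bookkeeping rule, which the following sketch spells out. It is an illustration only: the function names `vertex_gain` and `total_exponents`, the tuple encoding of a vertex, and the sample configuration are hypothetical and not taken from the paper. An extra factor $\Psi$ obtained from excessive linking is counted as a factor $\Phi$, which is legitimate because $\Psi \leq \Phi$.

```python
def vertex_gain(in_F, charged, deg_gamma, deg_delta):
    """Return (l_Psi, l_Phi, source) for one summation vertex i of Gamma.

    in_F:      whether pi(i) lies in F
    charged:   whether i is a charged vertex of Gamma
    deg_gamma: degree of i in Gamma
    deg_delta: degree of pi(i) in Delta
    source:    where the factor Phi comes from (None if there is none)
    """
    minimal = deg_delta + (2 if in_F else 0)   # degree after minimal linking
    if deg_gamma < minimal:
        raise ValueError("vertex linked fewer times than required")
    l_psi = 1 if in_F else 0                   # cases (iii), (iv): one factor Psi from linking
    if not charged:
        return (l_psi, 0, None)                # cases (i), (iii)
    if deg_gamma == minimal:
        return (l_psi, 1, "vertex resolution")  # marked vertex
    return (l_psi, 1, "extra linking")          # extra Psi, used as Phi since Psi <= Phi

def total_exponents(p, deg_delta_graph, vertices):
    """Exponents (of Psi, of Phi) gained from a list of vertices, on top of Psi^(p deg(Delta))."""
    gains = [vertex_gain(*v) for v in vertices]
    return (p * deg_delta_graph + sum(g[0] for g in gains), sum(g[1] for g in gains))

# Hypothetical configuration: p = 2 copies of a graph of degree 5 with three summation
# vertices, one in F and charged, one charged and not in F, one neither.
p, degree = 2, 5
vertices = [
    (True, True, 4, 2), (False, True, 2, 2), (False, False, 2, 2),   # first copy
    (True, True, 6, 2), (False, True, 4, 2), (False, False, 2, 2),   # second copy
]
exponents = total_exponents(p, degree, vertices)
print(exponents)
```

With $|F| = 1$ and two charged vertices in the hypothetical graph, the result matches the exponents $p(\deg(\Delta) + |F|) = 12$ and $p|V_c(\Delta)| = 4$ of the target bound.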

Before moving on to the main argument of this section, we outline how a marked vertex yields an additional factor $\Phi$. We claim that Definitions 4.7 and 8.1 imply that

$$i \in V_s(\Gamma) \text{ is marked} \quad \Longrightarrow \quad \nu_i(\Gamma) \neq \nu_i^*(\Gamma) \qquad (8.2)$$

(see Definition 4.6). To see this, let $i$ be an arbitrary marked vertex. If $\pi(i) \notin F$ then by Definition 8.1 $i$ has not been linked to, and hence $\nu_i^\xi(\Gamma) = \nu_{\pi(i)}^\xi(\Delta)$ for $\xi = 1, *$. Therefore (8.2) follows from Definition 4.7. On the other hand, if $\pi(i) \in F$ then by Definition 8.1 $i$ has been linked to once, and either (a) $\nu_i(\Gamma) = \nu_{\pi(i)}(\Delta)$ and $\nu_i^*(\Gamma) = \nu_{\pi(i)}^*(\Delta) + 2$ or (b) $\nu_i(\Gamma) = \nu_{\pi(i)}(\Delta) + 2$ and $\nu_i^*(\Gamma) = \nu_{\pi(i)}^*(\Delta)$. Either way, Definition 4.7 yields (8.2).

Roughly, vertex resolution splits each summation vertex of degree $2d$ (we assume for simplicity that the vertex is of even degree) into $d$ fresh summation vertices of degree two. If the right-hand side of (8.2) holds (as it does if we are resolving a marked vertex), at least one of the fresh summation vertices will be a chain vertex (i.e. both of its incident edges will have the same colour). The desired gain of a factor $\Phi$ will then come from an application of Proposition 5.3.



8.1. General graphical representation. Throughout Sections 8 and 9, we shall fix $\mathbf{a}$ and apply three algebraic operations to the monomial encoded by $\Gamma$:

(i) Family A identities, 

(ii) Family B identities, 

(iii) partial expectation in a. 

In order to keep track of the structure of the ensuing terms, we make heavy use of graphs. For pedagogical 
reasons, we shall develop the algebraic and graphical languages in parallel. Each algebraic expression (a 
monomial in the matrix entries of G, Q, H, and S) is represented by a graph. Application of one of the three 
elementary algebraic identities listed above can, as before, be described by an elementary transformation on
graphs. Although the entire argument could be stated in terms of graphs alone, this would rather obscure 
the underlying mechanism, which always corresponds to applying one of the three algebraic operations listed 
above. Instead, we introduce each graph operation when it naturally arises in our argument. In order to 
obtain a set of graphs that is closed under all of the operations we shall need, we extend our set of graphs 
according to the following definition. 

Definition 8.3 (General graph). By a graph we mean a quintuple $(V_f, V_s, V_e, E, \xi)$ with the following
properties. The set $E$ is a set of edges on the vertex set $V := V_f \cup V_s \cup V_e$. Multiple edges as well as loops
are allowed. The colouring $\xi$ is a map

$$ \xi : E \to \{\text{solid}, \text{dashed}, \text{dotted}, \text{wiggly}\} . $$

As in Definition 4.1 we sometimes use the alternative notations solid $= 1$ and dashed $= *$.

An edge that is solid or dashed is called a resolvent edge. Dotted and resolvent edges are directed, while
wiggly edges are undirected. The vertices in $V_s$ are called the original summation vertices, in $V_f$ the fresh
summation vertices, and in $V_e$ the external vertices. Vertices in $V_f$ are drawn using white dots and vertices
in $V_s \cup V_e$ using black dots.
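The following is a minimal sketch of how the quintuple from Definition 8.3 could be held in code. The class and field names (`Colour`, `Edge`, `GeneralGraph`, `tail`, `head`) are illustrative assumptions and do not appear in the text; decorations such as diamonds, strokes and rings are not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum


class Colour(Enum):
    SOLID = "solid"    # encodes an entry of G
    DASHED = "dashed"  # encodes an entry of G*
    DOTTED = "dotted"  # encodes an entry of H
    WIGGLY = "wiggly"  # encodes an entry of S


@dataclass(frozen=True)
class Edge:
    tail: str
    head: str
    colour: Colour

    @property
    def is_resolvent(self) -> bool:
        # Solid and dashed edges are the resolvent edges.
        return self.colour in (Colour.SOLID, Colour.DASHED)

    @property
    def is_directed(self) -> bool:
        # Dotted and resolvent edges are directed; wiggly edges are not.
        return self.colour is not Colour.WIGGLY


@dataclass
class GeneralGraph:
    fresh: set       # V_f, drawn as white dots
    original: set    # V_s, drawn as black dots
    external: set    # V_e, drawn as black dots
    edges: list = field(default_factory=list)  # a list, so multiple edges and loops are allowed

    def vertices(self) -> set:
        return self.fresh | self.original | self.external

    def add_edge(self, tail, head, colour) -> None:
        if tail not in self.vertices() or head not in self.vertices():
            raise ValueError("edge endpoints must be vertices of the graph")
        self.edges.append(Edge(tail, head, colour))

    def degree(self, v) -> int:
        # A loop at v contributes two to the degree of v.
        return sum((e.tail == v) + (e.head == v) for e in self.edges)
```

Storing the edges in a list rather than a set is what permits the multiple edges and loops that the definition explicitly allows.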

Figure 3.8 contains the dictionary of the colour-code: a solid edge encodes an entry of $G$, a dashed edge
an entry of $G^*$, a dotted edge an entry of $H$, and a wiggly edge an entry of $S$. More precisely, each resolvent
edge encodes a maximally expanded (in $\mathbf a$) resolvent entry (see Definition 6.4), and each dotted edge an
$\mathbf a$-admissible entry of $H$, which is the subject of the following definition.

Definition 8.4. The entry $h_{uv}$ is an $\mathbf a$-admissible entry of $H$ if $u \in \mathbf a$ or $v \in \mathbf a$.



The arguments below consist of a series of operations on the set of graphs from Definition 8.3. To be
completely precise, below we shall in fact adorn the graphs from Definition 8.3 with decorations: resolvent
loops may be decorated with a black or white diamond (see Figure 3.3), wiggly edges with an arbitrary
number of crossing strokes, and original summation vertices with an arbitrary number of rings. (The latter
two concepts are defined above Definition 8.5 and in the beginning of the proof of Lemma 9.3 respectively.)

To each vertex $i \in V(\Gamma)$ of a graph $\Gamma$ we assign an index $u_i$. We shall consistently use the splitting

$$ \mathbf u = (u_i)_{i \in V(\Gamma)} = (\mathbf x, \mathbf a, \boldsymbol\mu), \qquad \mathbf x = (x_i)_{i \in V_f(\Gamma)}, \quad \mathbf a = (a_i)_{i \in V_s(\Gamma)}, \quad \boldsymbol\mu = (\mu_i)_{i \in V_e(\Gamma)} . $$

We say that the index $u_i$ is associated with the vertex $i$, and vice versa. We introduce the notation

$$ \mathcal A(\Gamma) = \mathcal A_{\mathbf a, \mathbf x}(\Gamma) \qquad (8.3) $$

for the monomial (in the entries of $G^{(T)}$, $G^{(T)*}$, $H$, and $S$ where $T \subset \mathbf a$) encoded by the graph $\Gamma$. Note that
$\mathcal A(\Gamma)$ has an explicit formula, analogous to (4.2) except needing much heavier notation. As we shall not need
it, we shall not give it. (Note also that our graphs do not keep track of any factors of $Q$, as we shall not
need this information.)

8.2. Generation of the fresh summation vertices. Next, we define the vertex resolution operation precisely.
Our starting point is a fixed $\Gamma \in \mathfrak G_F^p(\Delta)$. In order to streamline the argument, we at first make the following
simplifying assumption on $\Delta$, which is removed in Section 9.

(S2) There are no diagonal entries $\mathcal G_{aa} = G_{aa} - m$ in $\mathcal Z(\Delta)$. (I.e. $\Delta$ has no loops.)

The operation of vertex resolution consists of two main steps: the generation and lumping of fresh summation
vertices. The idea behind the first step - the generation of fresh summation vertices - is to resolve, using
the Family B identities (3.14a) and (3.14b), each (already maximally expanded) off-diagonal resolvent entry
$G_{uv}^{(\mathbf a \setminus \{u,v\})}$, with $u \neq v$, in any summation index from the set $\{u, v\}$. (The word "resolve" here refers to
explicitly identifying the dependence on all matrix entries $h_{ij}$ with $i, j \in \mathbf a$ so that partial expectation in
these variables can be taken. After taking the partial expectation, we shall get expressions that can again
be represented by admissible graphs.) More precisely, we write



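$$
G_{uv}^{(\mathbf a \setminus \{u,v\})} =
\begin{cases}
-\,G_{uu}^{(\mathbf a \setminus \{u\})} \sum_{x}^{(\mathbf a)} h_{ux}\, G_{xv}^{(\mathbf a)} & (u \text{ summation and } v \text{ external}) \\[4pt]
-\,G_{vv}^{(\mathbf a \setminus \{v\})} \sum_{x}^{(\mathbf a)} G_{ux}^{(\mathbf a)}\, h_{xv} & (u \text{ external and } v \text{ summation}) \\[4pt]
G_{uu}^{(\mathbf a \setminus \{u,v\})}\, G_{vv}^{(\mathbf a \setminus \{v\})} \Bigl( -h_{uv} + \sum_{x,y}^{(\mathbf a)} h_{ux}\, G_{xy}^{(\mathbf a)}\, h_{yv} \Bigr) & (u \text{ and } v \text{ summation}) .
\end{cases}
\qquad (8.4)
$$

(As stated after Definition 4.1 we exclude the trivial case where both $u$ and $v$ are external indices.) The
proof of (8.4) is a straightforward consequence of the identities (3.14a) and (3.14b), and the fact that h/ii
by definition of Xp(A). For example, if $a$ and $b$ are summation indices and $\mu$ and $\nu$ are external indices, we
write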



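$$
\begin{aligned}
G_{\mu a}^{(b)}\, G_{ab}\, G_{b\mu}^{(a)*}\, G_{a\nu}^{(b)}\, G_{\nu a}^{(b)*}
&= G_{aa}^{(b)} G_{aa}\, G_{bb}^{(a)} G_{bb}^{(a)*}\, G_{aa}^{(b)} G_{aa}^{(b)*} \sum_{x,y,z,u,v,w}^{(ab)} G_{\mu x}^{(ab)} h_{xa} h_{ay} G_{yz}^{(ab)} h_{zb} h_{bu} G_{u\mu}^{(ab)*} h_{av} G_{v\nu}^{(ab)} G_{\nu w}^{(ab)*} h_{wa} \\
&\quad - G_{aa}^{(b)} G_{aa}\, G_{bb}^{(a)} G_{bb}^{(a)*}\, G_{aa}^{(b)} G_{aa}^{(b)*} \sum_{x,u,v,w}^{(ab)} G_{\mu x}^{(ab)} h_{xa} h_{ab} h_{bu} G_{u\mu}^{(ab)*} h_{av} G_{v\nu}^{(ab)} G_{\nu w}^{(ab)*} h_{wa} .
\end{aligned}
\qquad (8.5)
$$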

Notice that along this procedure we may generate non-maximally expanded diagonal terms (e.g. $G_{aa}$ above),
but off-diagonal terms always have upper indices $\mathbf a$. Moreover, all entries of $H$ on the right-hand side of
(8.5) are $\mathbf a$-admissible.

At this point we make the following further simplification, which leaves the essence of the argument
unchanged but removes some technicalities.

(S3) We replace any diagonal term $G_{aa}^{(T)}$ with $m$ and any diagonal term $G_{aa}^{(T)*}$ with $\bar m$. (Recall that $G_{aa}^{(T)} \approx m$
in the sense that $G_{aa}^{(T)} - m \prec \Psi$ by definition of $\Lambda$.) This replacement is done in two places: in $\mathcal A(\Gamma)$
and in the identities (3.14a) and (3.14b) which underlie the algebra of vertex resolution.

Again, Simplification (S3) is removed in Section 9. Thus, under Simplification (S3), we neglect all diagonal
terms in (8.5). (More precisely, we replace each one of them with $m$ or $\bar m$; the resulting powers of $m$ and $\bar m$
are irrelevant for estimates by (3.11).)



The use of graphs greatly simplifies the analysis of complicated expressions like (8.5). The identities (8.4)
all have obvious graphical representations. In Figure 8.2 we give a graphical depiction of (8.5). (Recall the
conventions introduced in Figure 3.8.) In the typical case, the summation vertex of degree four associated
with $a$ gives rise to four fresh summation vertices, associated with $x, y, v, w$; likewise the summation vertex
of degree two associated with $b$ creates two fresh vertices associated with $u$ and $z$ (first term on the right-hand side
of (8.5)). Due to the presence of the term $h_{uv}$ in (8.4), sometimes two summation indices are directly connected
with a dotted line at the expense of one fewer fresh summation vertex adjacent to each of these two indices.
This results in the second term on the right-hand side of (8.5), which contains a factor $h_{ab}$ (note that $G_{ab}$
plays the role of $G_{uv}^{(\mathbf a \setminus \{u,v\})}$ from (8.4)).
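The expansion step behind the first line of (8.4) can be checked numerically on a small matrix. The sketch below is an illustration only: it tests the simplest instance, with no indices removed beforehand, namely $G_{uv} = -G_{uu} \sum_{x \neq u} h_{ux} G^{(u)}_{xv}$ for $u \neq v$, which is the standard Schur complement formula. All function names are assumptions introduced here, and the Gaussian test matrix is an arbitrary choice.

```python
import random


def inverse(a):
    """Invert a square complex matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(h, z, removed=()):
    """Resolvent (H - z)^{-1} of the minor of h with the rows/columns in `removed` deleted.

    The result is a dict keyed by pairs of the original indices."""
    keep = [i for i in range(len(h)) if i not in removed]
    m = [[h[i][j] - (z if i == j else 0) for j in keep] for i in keep]
    inv = inverse(m)
    return {(keep[r], keep[c]): inv[r][c]
            for r in range(len(keep)) for c in range(len(keep))}


def random_hermitian(n, rng):
    """A Hermitian matrix with Gaussian entries (an arbitrary test ensemble)."""
    h = [[0j] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = complex(rng.gauss(0, 1), 0)
        for j in range(i + 1, n):
            h[i][j] = complex(rng.gauss(0, 1), rng.gauss(0, 1))
            h[j][i] = h[i][j].conjugate()
    return h


def expansion_error(h, z, u, v):
    """|G_uv + G_uu * sum_{x != u} h_ux G^{(u)}_xv| for u != v; should be ~0."""
    g = resolvent(h, z)
    g_minor = resolvent(h, z, removed=(u,))
    rhs = -g[(u, u)] * sum(h[u][x] * g_minor[(x, v)]
                           for x in range(len(h)) if x != u)
    return abs(g[(u, v)] - rhs)
```

For instance, `expansion_error(random_hermitian(5, random.Random(0)), 0.3 + 0.5j, 0, 3)` should be at the level of floating-point rounding, since the spectral parameter has positive imaginary part and all the matrices involved are therefore invertible.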





Figure 8.2. The vertex resolution from (8.5). The graph $\Gamma$ is represented on the top and both graphs of $\mathfrak R(\Gamma)$ are
represented on the bottom. In accordance with Simplification (S3) we do not draw the diagonal entries of $G$.

The generation of fresh summation vertices for a general $\Gamma \in \mathfrak G_F^p(\Delta)$ is no different from the above
example. Applying the graphical rules associated with (8.4) to each edge of $\Gamma$, we get a finite family of
graphs which we denote by $\mathfrak R(\Gamma)$. In accordance with Simplifications (S2) and (S3), in this section we drop
all diagonal terms, and hence all loops from the graphs in $\mathfrak R(\Gamma)$. (In Section 9 below we keep track of the
loops, which will lead to the larger set $\mathfrak R$.)

Any graph $\Theta \in \mathfrak R(\Gamma)$ has the following properties.

(i) $\Theta$ has only straight, dashed or dotted edges but no wiggly edge (see Figure 3.8).

(ii) $\Theta$ has no loops or multiple edges.

(iii) The sets $V_s(\Theta) = V_s(\Gamma)$ and $V_e(\Theta) = V_e(\Gamma)$ remain unchanged, as do the associated indices $\mathbf a$ and $\boldsymbol\mu$.
In addition, we now have a new set of vertices, $V_f(\Theta) \neq \emptyset$, which indexes the fresh summation vertices
$\mathbf x$.

(iv) Each resolvent edge of $\Theta$ encodes an entry of $G^{(\mathbf a)}$ or $G^{(\mathbf a)*}$ in $\mathcal A(\Theta)$ (in particular, no resolvent edge
of $\Theta$ is incident to $V_s(\Theta)$). Each dotted edge of $\Theta$ encodes an $\mathbf a$-admissible entry of $H$ in $\mathcal A(\Theta)$.

(v) For $i, j \in V_s(\Theta)$ let the symmetric function $\sigma(i,j)$ denote the number of dotted edges joining $i$ and $j$.
The number of edges in $\Gamma$ and $\Theta$ is conserved in the sense that

$$ |E(\Gamma)| = \sum_{e \in E(\Theta)} \bigl( \mathbf 1(\xi_e = 1) + \mathbf 1(\xi_e = *) \bigr) + \frac{1}{2} \sum_{i,j \in V_s(\Theta)} \sigma(i,j) . \qquad (8.6) $$

Informally: in the process $\Gamma \mapsto \Theta$ that generates fresh summation vertices, each resolvent entry either
remains a resolvent entry or is replaced with an entry of $H$ with original summation vertices. This
simply corresponds to the two terms in the last line of the right-hand side of (8.4).

(vi) Each original summation vertex is incident only to dotted edges.

(vii) Each fresh summation vertex has degree two and is incident to precisely one dotted edge.

We remark that, by construction, the sets $\mathbf x$ and $\mathbf a$ are disjoint, as are the sets $\mathbf a$ and $\boldsymbol\mu$. However, $\mathbf x$ and
$\boldsymbol\mu$ are in general not disjoint, and indices of $\mathbf x$ may coincide.

8.3. Lumping of the fresh summation vertices. We now take the expectation of $\mathcal A(\Theta)$. In fact, all that
we shall need is the partial expectation in $\mathbf a$. The key observation is that, by Property (iv) in Section
8.2, each resolvent entry of $\mathcal A(\Theta)$ is independent of $\mathbf a$ and each entry of $H$ is $\mathbf a$-admissible. In particular,
the partial expectation $\prod_{a \in \mathbf a} P_a$ acting on $\mathcal A(\Theta)$ acts on the product of the entries of $H$ alone. Since this
product is an explicit monomial, we can evaluate its expectation directly. If the random variables $h_{uv}$ were
Gaussian, this would correspond to a simple Wick-pairing of the dotted edges. Pairing two dotted edges,
each of them incident to a fresh summation vertex and a common original summation vertex, results in
a pairing of two fresh summation vertices. Since $\mathbf x$ and $\mathbf a$ are disjoint, a dotted edge incident to a fresh
summation vertex cannot be paired with a dotted edge incident to two original summation vertices. In the
non-Gaussian case, higher-order moments are also present, but they are suppressed by a combinatorial factor
(i.e. a positive power of $M^{-1}$). Graphically, we represent the procedure of taking expectation by pairing (or
in general lumping) fresh summation indices, and replace the corresponding doubled dotted line by a wiggly
line. What follows is a more precise description.

We define the second step of vertex resolution - the lumping of the fresh summation vertices. Before giving
the general procedure, we complete the analysis of the example (8.5). From (8.5), assuming Simplification
(S3), we get

$$
\begin{aligned}
\mathbb E\, G_{\mu a}^{(b)} G_{ab} G_{b\mu}^{(a)*} G_{a\nu}^{(b)} G_{\nu a}^{(b)*}
&\overset{(S3)}{=} m^4 \bar m^2 \sum_{x,y,z,u,v,w}^{(ab)} \mathbb E\, G_{\mu x}^{(ab)} G_{yz}^{(ab)} G_{u\mu}^{(ab)*} G_{v\nu}^{(ab)} G_{\nu w}^{(ab)*}\, P_a (h_{xa} h_{ay} h_{av} h_{wa})\, P_b (h_{zb} h_{bu}) \\
&\quad - m^4 \bar m^2 \sum_{x,u,v,w}^{(ab)} \mathbb E\, G_{\mu x}^{(ab)} G_{u\mu}^{(ab)*} G_{v\nu}^{(ab)} G_{\nu w}^{(ab)*}\, h_{av} h_{wa} h_{xa}\, P_b (h_{ab} h_{bu}) ,
\end{aligned}
\qquad (8.7)
$$

where we used the trivial identity $\mathbb E X = \mathbb E P_a P_b X$ and the fact that $G^{(ab)}$ is independent of $a$ and $b$ (see
Definition 2.3). Since $a \neq u$, the partial expectation $P_b$ on the second line of (8.7) vanishes. The partial
expectations on the first line of (8.7) may be computed explicitly, similarly to the computation (3.38):

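$$
P_a (h_{xa} h_{ay} h_{av} h_{wa})\, P_b (h_{zb} h_{bu})
= s_{ax} s_{av} s_{bz}\, \delta_{xy} \delta_{vw} \delta_{zu} + s_{ax} s_{ay} s_{bz}\, \delta_{xv} \delta_{yw} \delta_{zu} + \delta_{xy} \delta_{xv} \delta_{xw}\, s_{ax}^2 s_{bz}\, \delta_{zu}\, \mathbb E |\zeta_{ax}|^4 .
$$

Here we assumed the condition (3.21). In the case (3.20), we get the additional term

$$ s_{ax} s_{ay}\, \delta_{xw}\, s_{bz}\, \delta_{yv} \delta_{zu} . $$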
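The pairing structure at the vertex $a$ in this example can be enumerated mechanically. The sketch below lists the pairings of the four dotted edges $h_{xa}, h_{ay}, h_{av}, h_{wa}$ that have a nonvanishing expectation. It rests on an assumption that is not stated in this section: that (3.21) is the complex Hermitian case in which $\mathbb E h_{ij}^2 = 0$, so that two edges pointing the same way relative to $a$ cannot be paired, and that (3.20) is the case in which such pairings are allowed. The names `pairings`, `EDGES_AT_A` and `contributing_pairings` are illustrative.

```python
def pairings(items):
    """Yield all perfect matchings of the list `items` as lists of pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Dotted edges at the vertex a: (fresh index, orientation relative to a).
# "in" stands for an entry h_{xa}, "out" for an entry h_{ay}.
EDGES_AT_A = [("x", "in"), ("y", "out"), ("v", "out"), ("w", "in")]


def contributing_pairings(edges, allow_same_orientation):
    """Pairings with nonzero expectation, as sorted lists of index pairs.

    If `allow_same_orientation` is False (assumed to model (3.21)), a pair is kept
    only if it joins an "in" edge with an "out" edge.
    """
    result = []
    for matching in pairings(edges):
        if allow_same_orientation or all(p[1] != q[1] for p, q in matching):
            result.append(sorted(tuple(sorted((p[0], q[0]))) for p, q in matching))
    return sorted(result)
```

Under this assumption, `contributing_pairings(EDGES_AT_A, False)` returns the two pairings $\{x,y\},\{v,w\}$ and $\{x,v\},\{y,w\}$, matching the two products of Kronecker deltas displayed above, while `contributing_pairings(EDGES_AT_A, True)` also returns $\{x,w\},\{y,v\}$, matching the additional term. The higher-moment term, in which all four indices coincide, is a lumping rather than a pairing and is not produced by this enumeration.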



Thus we may write (in the case (3.21) for simplicity)

$$
\begin{aligned}
\mathbb E\, G_{\mu a}^{(b)} G_{ab} G_{b\mu}^{(a)*} G_{a\nu}^{(b)} G_{\nu a}^{(b)*}
&\overset{(S3)}{=} m^4 \bar m^2\, \mathbb E \sum_{x,v,z}^{(ab)} s_{ax} s_{av} s_{bz}\, G_{\mu x}^{(ab)} G_{xz}^{(ab)} G_{z\mu}^{(ab)*} G_{v\nu}^{(ab)} G_{\nu v}^{(ab)*} \\
&\quad + m^4 \bar m^2\, \mathbb E \sum_{x,y,z}^{(ab)} s_{ax} s_{ay} s_{bz}\, G_{\mu x}^{(ab)} G_{yz}^{(ab)} G_{z\mu}^{(ab)*} G_{x\nu}^{(ab)} G_{\nu y}^{(ab)*} \\
&\quad + m^4 \bar m^2\, \mathbb E \sum_{x,z}^{(ab)} s_{ax}^2 s_{bz} \bigl( \mathbb E |\zeta_{ax}|^4 \bigr)\, G_{\mu x}^{(ab)} G_{xz}^{(ab)} G_{z\mu}^{(ab)*} G_{x\nu}^{(ab)} G_{\nu x}^{(ab)*} .
\end{aligned}
\qquad (8.8)
$$

Note that the summations on the right-hand side are performed with respect to weights (see Definition 4.4).
Now it is apparent how better estimates are available for each term on the right-hand side than for the term on
the left-hand side. Indeed, all terms contain five off-diagonal entries of $G$. In addition, however, the first two
terms on the right-hand side contain a summation index associated with a chain vertex $x$ (see Definition 5.1)
which is summed over with respect to the chain weight $s_{ax}$ (see Definition 5.2). The last term is suppressed
by an additional factor $s_{ax} \leq M^{-1}$. Invoking Proposition 5.3 (and neglecting the upper indices $(ab)$ which
are dealt with easily in the full proof below), we find that the arguments of $\mathbb E$ on the right-hand side of (8.8)
are all $O_\prec(\Psi^5 \Phi)$. Here the extra factor $\Phi$ was extracted from the resolution of the vertex $i$ associated with
the summation variable $a$, and arose from the fact that $\nu_i(\Gamma) \neq \nu_i^*(\Gamma)$.

Again, the lumping of fresh summation vertices is best represented graphically. In order to represent
terms like the last term on the right-hand side of (8.8) graphically, we represent the expression $\mathbb E |h_{ij}|^{2+k}$
using a wiggly line crossed by $k$ strokes. Thus, a wiggly edge may be either uncrossed or crossed. We shall
always use the bound

$$ \mathbb E |h_{ij}|^{2+k} = s_{ij}^{1 + k/2}\, \mathbb E |\zeta_{ij}|^{2+k} \leq C s_{ij} M^{-k/2} $$

in combination with crossed wiggly lines. See Figure 8.3 for a graphical depiction of (8.8).

Figure 8.3. The lumping of the fresh summation vertices in the example (8.5). The graph on the left-hand side is
$\Theta$ (representing the first term on the right-hand side of (8.7)), and the three graphs on the right-hand side are the
elements of $\mathfrak L(\Theta)$ (representing the three terms on the right-hand side of (8.8)).

The general case is similar to the example (8.8).



Definition 8.5 (Lumping). A lumping is a partition whose blocks, called lumps, have size greater than or
equal to two.
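To make Definition 8.5 concrete, the following sketch enumerates all lumpings of a finite set, i.e. all set partitions without singleton blocks. The function name `lumpings` and the representation of a lumping as a list of lists are illustrative choices.

```python
def lumpings(vertices):
    """Yield every partition of `vertices` whose blocks (lumps) all have size >= 2."""
    vertices = list(vertices)
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    # Choose the non-empty set of companions sharing a lump with `first`
    # by running over all non-zero bitmasks on `rest`.
    for mask in range(1, 2 ** len(rest)):
        companions = [v for k, v in enumerate(rest) if (mask >> k) & 1]
        remaining = [v for k, v in enumerate(rest) if not (mask >> k) & 1]
        for tail in lumpings(remaining):
            yield [[first] + companions] + tail
```

For four vertices this yields four lumpings: the three pairings and the single lump of size four. The pairings are the lumpings in which every lump has size exactly two.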

Let $\Gamma \in \mathfrak G_F^p(\Delta)$ and $\Theta \in \mathfrak R(\Gamma)$. From $\Theta$ we generate a finite family of graphs $\Upsilon$ by taking all lumpings of
the white vertices (fresh summation vertices) of $\Theta$. (This lumping, or identification of vertices, arises from
taking the partial expectation $\prod_{a \in \mathbf a} P_a$ of all entries of $H$ in $\mathcal A(\Theta)$. Note that each fresh summation vertex
of $\Theta$ must be lumped with at least another one because $h_{ij}$ and $h_{kl}$ are independent if $\{i,j\} \neq \{k,l\}$, and
$P_i h_{ij} = 0$.) Thus, the result of the lumping is to merge some white vertices into lumps, where each lump
consists of at least two vertices, and is again represented as a single white vertex. We denote by $\mathfrak L(\Theta)$ the set
of such graphs $\Upsilon$ obtained by lumping the fresh summation vertices of $\Theta$. Recall that all resolvent matrix
entries encoded by the resolvent edges of $\Theta$ have upper indices $\mathbf a$. Hence any factors $Q_a$ with $a \in \mathbf a$ act
trivially on them according to the identity $Q_a \bigl( G_{ij}^{(\mathbf a)} X \bigr) = G_{ij}^{(\mathbf a)} Q_a X$ for all $a \in \mathbf a$. Therefore the factors $Q$
only act on entries of $H$. As in the example (3.36), they simply forbid some pairings; this restriction is of no
importance for us, and we shall estimate the contribution of arbitrary pairings. Note that after the partial
expectation in $\mathbf a$ has been taken, no factors $Q$ remain. In particular, $\mathcal A(\Upsilon)$ has no factors $Q$.
Any graph $\Upsilon \in \mathfrak L(\Theta)$ has the following properties.

(i) $\Upsilon$ has no loops or multiple edges.

(ii) The vertex set of $\Upsilon$ is a disjoint union $V(\Upsilon) = V_f(\Upsilon) \cup V_s(\Upsilon) \cup V_e(\Upsilon)$, where $V_s(\Upsilon) = V_s(\Theta) = V_s(\Gamma)$
is the set of original summation vertices, and $V_e(\Upsilon) = V_e(\Theta) = V_e(\Gamma)$ is the set of external
vertices. (The set of fresh summation vertices $V_f(\Upsilon)$ is strictly smaller than $V_f(\Theta)$.)

(iii) A directed edge of $\Upsilon$ has one of two colours: solid (encoding $G^{(\mathbf a)}$) or dashed (encoding $G^{(\mathbf a)*}$).
An undirected edge is always wiggly. An uncrossed wiggly edge $\{i,j\}$ encodes $\mathbb E |h_{ij}|^2 = s_{ij}$, and a
wiggly edge $\{i,j\}$ crossed by $k$ strokes encodes $\mathbb E |h_{ij}|^{2+k} = s_{ij}^{1+k/2}\, \mathbb E |\zeta_{ij}|^{2+k}$.

(iv) Each $i \in V_f(\Upsilon)$ is adjacent (via a wiggly line) to a unique vertex $p(i) \in V_s(\Upsilon)$. Thus, the map
$p : V_f(\Upsilon) \to V_s(\Upsilon)$ is a projection which associates with each fresh summation vertex $i$ its "parent"
original summation vertex $p(i)$. (E.g. in the first graph on the right-hand side of Figure 8.3 we have
$p(x) = p(v) = a$ and $p(z) = b$.)

(v) For each $i, j \in V_s(\Upsilon)$, we have $\sigma(i,j) \in \{0, 2, 3, 4, \dots\}$. (Recall that $\sigma(i,j)$ is the number of dotted
lines in the graph $\Theta$ between vertices $i, j \in V_s(\Upsilon) = V_s(\Theta)$.) If $\sigma(i,j) \geq 2$ then $i$ and $j$ are connected
by a wiggly line crossed by $\sigma(i,j) - 2$ strokes. Moreover, the number of edges is conserved in the sense
that

$$ |E(\Gamma)| = \sum_{e \in E(\Upsilon)} \bigl( \mathbf 1(\xi_e = 1) + \mathbf 1(\xi_e = *) \bigr) + \frac{1}{2} \sum_{i,j \in V_s(\Upsilon)} \sigma(i,j) , \qquad (8.9) $$

as follows from (8.6).

See Figure 8.4 for an illustration of (v).




Figure 8.4. The resolution process $\Gamma \mapsto \Theta \mapsto \Upsilon$, where in the first step we chose a $\Theta$ satisfying $\sigma(i,j) = 2$ where $i$
and $j$ are the original summation vertices associated with $a$ and $b$ respectively. This figure illustrates the property
(v) above, as well as the conservation of the number of edges from (8.6) and (8.9).



8.4. Summing over the fresh summation indices and completion of the proof. The complete process of
vertex resolution may be summarized as

$$ \Gamma \in \mathfrak G_F^p(\Delta) \;\longmapsto\; \Theta \in \mathfrak R(\Gamma) \;\longmapsto\; \Upsilon \in \mathfrak L(\Theta) . $$

Here the first step represents explicitly resolving all maximally expanded entries of $G$ (encoded by resolvent
edges of $\Gamma$) in $\mathbf a$-admissible entries of $H$. The second step represents taking partial expectation in all
these $h$-variables.

At this point, we introduce a further, and final, simplification that allows us to postpone some needless
technicalities to Section 9.

(S4) The sets $\mathbf x = (x_i)_{i \in V_f(\Upsilon)}$ and $\boldsymbol\mu = (\mu_i)_{i \in V_e(\Upsilon)}$ are disjoint, and all indices of $\mathbf x$ are distinct.

Thus, Simplification (S4) is in the same spirit as Simplification (S1). We assume Simplification (S4) throughout
Section 8.4.

Let $\Gamma \in \mathfrak G_F^p(\Delta)$, $\Theta \in \mathfrak R(\Gamma)$, and $\Upsilon \in \mathfrak L(\Theta)$. Choose and fix a marked vertex $i \in V_m(\Gamma)$ (see Definition
8.1). In order to gain an extra factor $\Phi$ from $i$, we consider three cases, (a), (b), and (c).

First we explain them informally. Case (a) is the typical situation. Since $i$ is marked, it is a charged
vertex in $\Delta$, i.e. after having been minimally linked to, the number of solid and dashed edges adjacent to
it are different. In case (a) we show that this property is inherited by at least one of the fresh summation
vertices whose parent is $i$. This fact is fairly clear since neither vertex generation nor lumping alters the
number of solid or dashed edges. This will imply that one fresh summation vertex generated by $i$ is a chain
vertex in $\Upsilon$; this in turn will give the extra factor $\Phi$ associated with $i \in V_m(\Gamma)$. The other two cases represent
exceptional cases. Case (b) deals with a higher-order lumping (i.e. a lumping which has a lump of size greater
than two). As indicated above, this results in a combinatorial gain expressed in terms of powers of $M^{-1}$.
Such a factor is depicted graphically using a crossed wiggly line. Finally, case (c) deals with the consequence
of the factor $h_{uv}$ on the last line of (8.4), i.e. when a dotted edge joins two original summation vertices. We
note that choosing the term $h_{uv}$ is the only way to change the number of solid or dashed edges (off-diagonal
entries of $G$) in the process of vertex resolution. Since dotted edges must be paired, we find that at least
two parallel solid or dashed edges must be replaced with a dotted edge; this corresponds to replacing two
off-diagonal factors $G_{uv}$ with $|h_{uv}|^2$. After expectation, this means trading in a factor $\Psi^2$ for $M^{-1}$. Since
$\Psi \Phi \geq M^{-1/2}$, the factor $M^{-1}$ may be estimated by $\Psi^2 \Phi^2$. Out of this, $\Psi^2$ is used to compensate for the
loss of $\Psi^2$ mentioned above (losing two off-diagonal entries of $G$), and the remaining $\Phi^2$ provides us with the
gains of $\Phi$ associated with each of the two vertices incident to the dotted edge. See Figure 8.5 for a graphical
depiction of the cases (a), (b), and (c). Below we formalize these ideas.



Recall the definitions of $\nu_i$ and $\nu_i^*$ for an arbitrary edge-coloured, directed multigraph (Definition 4.6).
Informally, $\nu_i^\xi$ gives the number of legs of colour $\xi$ incident to $i$. The following definition gives a natural
extension of chain weights to graphs with wiggly edges.

Definition 8.6. A vertex $i \in V_f(\Upsilon)$ is a chain vertex if it has degree three, such that: (i) $i$ is incident to
exactly one wiggly edge, and (ii) $i$ is incident to exactly two resolvent edges, which are of the same colour.
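The condition in Definition 8.6 is a purely local check on the colours of the edges incident to a fresh summation vertex. A minimal sketch follows, in which the colours are passed as plain strings; the function name and the string encoding are illustrative assumptions.

```python
def is_chain_vertex(incident_colours):
    """Check the condition of Definition 8.6 for a fresh summation vertex.

    `incident_colours` lists the colours ("solid", "dashed", "dotted", "wiggly")
    of the edges incident to the vertex, one entry per edge.
    """
    wiggly = [c for c in incident_colours if c == "wiggly"]
    resolvent = [c for c in incident_colours if c in ("solid", "dashed")]
    return (
        len(incident_colours) == 3       # degree three
        and len(wiggly) == 1             # (i) exactly one wiggly edge
        and len(resolvent) == 2          # (ii) exactly two resolvent edges ...
        and resolvent[0] == resolvent[1] # ... of the same colour
    )
```

In the first term on the right-hand side of (8.8), for example, a vertex incident to one wiggly edge and two solid edges passes this check, whereas a vertex incident to one solid and one dashed resolvent edge does not.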



Thus a chain vertex $i \in V_f(\Upsilon)$ corresponds to a chain vertex in the sense of Definition 4.6 (i.e. $\nu_i(\Upsilon) + \nu_i^*(\Upsilon) = 2$
and one of the terms vanishes), with the added condition that the summation over the index of $i$
is done with respect to a chain weight (encoded in $\mathcal A(\Upsilon)$ by the wiggly edge of $\Upsilon$ incident to $i$).

Now we present the details of the cases (a), (b), and (c). Let $\Gamma \in \mathfrak G_F^p(\Delta)$, $\Theta \in \mathfrak R(\Gamma)$, and $\Upsilon \in \mathfrak L(\Theta)$. Let
$i \in V_m(\Gamma)$ be marked.

(a) Suppose that $\sigma(i,j) = 0$ for all $j \in V_s(\Upsilon)$ (i.e. there are no wiggly edges in $\Upsilon$ that join two original
summation vertices). Suppose moreover that the lumping of the fresh summation vertices of $p^{-1}(i)$
that generates $\Upsilon$ is a pairing (hence $\deg_\Gamma(i)$ must be even).

Then $\nu_j(\Upsilon) + \nu_j^*(\Upsilon) = 2$ for any $j \in p^{-1}(i)$. More precisely, each $j \in p^{-1}(i)$ has degree three in $\Upsilon$; two
of the incident edges are resolvent edges, and the third edge is a wiggly uncrossed edge connecting $j$
with $i$. Moreover,

$$ \nu_i^\xi(\Gamma) = \sum_{j \in p^{-1}(i)} \nu_j^\xi(\Upsilon) \qquad (8.10) $$

for $\xi = 1, *$, since, under the assumption that there are no wiggly edges in $\Upsilon$ that join two original
summation vertices, the total number of resolvent edges incident to $i$ does not change by vertex
resolution and taking expectation. Hence we find from (8.2) and from $\nu_j(\Upsilon) + \nu_j^*(\Upsilon) = 2$ that there
must exist a $j \in p^{-1}(i)$ such that either $\nu_j(\Upsilon) = 0$ or $\nu_j^*(\Upsilon) = 0$; this implies that $j$ is a chain vertex
of $\Upsilon$. We conclude:

At least one $j \in p^{-1}(i)$ is a chain vertex of $\Upsilon$.

(b) Suppose that $\sigma(i,j) = 0$ for all $j \in V_s(\Upsilon)$, and the lumping of the fresh summation vertices of $p^{-1}(i)$
that generates $\Upsilon$ is not a pairing.

In this case one fresh summation vertex $j \in p^{-1}(i)$ was obtained by lumping together three or more
fresh summation vertices of $\Theta$. We conclude:

At least one $j \in p^{-1}(i)$ is connected in $\Upsilon$ to $i$ by a crossed wiggly edge.

(c) Suppose that $\sigma(i,j) > 0$ for some $j \in V_s(\Upsilon)$. By property (v) of Section 8.3, $\sigma(i,j) \geq 2$. By construction
of $\Upsilon$, if $\sigma(i,j) \geq 2$ then $i$ is joined to $j$ in $\Upsilon$ by a wiggly edge that is crossed by $\sigma(i,j) - 2$ strokes. We
conclude:

The vertex $i$ is connected in $\Upsilon$ to some $j \in V_s(\Upsilon)$ by a wiggly edge.

Figure 8.5. A simple example of vertex resolution giving rise to the three cases (a), (b), and (c) when resolving
the vertex associated with $a$. From top to bottom in the right-hand column: cases (a), (b), and (c). In the top
graph, corresponding to case (a), $x$ is a chain vertex (within the chain $u \to x \to \mu$) which is a remnant of the path
$b \to a \to \mu$ consisting of solid edges in the graph $\Gamma$. In the middle graph, corresponding to case (b), the crossed
wiggly line expresses that all four fresh summation indices arising from the resolution of $a$ coincide, and the four
associated dotted edges of $\Theta$ were collapsed into one of $\Upsilon$. Finally, in the bottom figure, corresponding to case (c),
the solid and dashed edges between the vertices associated with $a$ and $b$ in $\Gamma$ give rise to a single wiggly line according
to $s_{ab} = \mathbb E |h_{ab}|^2$.



Partition $V_m(\Gamma) = V_m^{(a)} \cup V_m^{(b)} \cup V_m^{(c)}$ into three subsets according to the three cases (a), (b), and (c). Now
we may estimate the contribution of $\Upsilon$. For the whole estimate we freeze the original summation vertices $\mathbf a$.
We shall use Proposition 5.3 on the $|V_m^{(a)}|$ chain vertices of $\Upsilon$. In order to do so, we still have to get rid of
the upper indices $(\mathbf a)$ from each resolvent entry. This is done exactly as in the end of Section 7.1 using the
identity (7.3). We omit further details. Thus Proposition 5.3 is applicable to each chain vertex of $\Upsilon$. The
remainder of the proof is a simple counting of different types of vertices.

Let

$$ \sigma := \frac{1}{2} \sum_{i,j \in V_s(\Theta)} \sigma(i,j) $$

denote the number of dotted edges of $\Theta$ that join two vertices in $V_s(\Theta)$. From (8.9) we find that $\Upsilon$ has
$|E(\Gamma)| - \sigma$ resolvent edges. Moreover, $\Upsilon$ has $|V_m^{(a)}|$ chain vertices. The gain from cases (b) and (c) is as
follows. From the vertices of type (b) we gain $M^{-|V_m^{(b)}|/2}$. (Each such vertex is incident to a crossed wiggly
line, and dropping the strokes crossing such a line yields a factor $M^{-1/2}$.) From the vertices of type (c) we
gain $M^{-\sigma/2}$. (Each dotted edge, encoding $h_{uv} = (s_{uv})^{1/2} \zeta_{uv}$, yields a contribution of size $(s_{uv})^{1/2} \leq C M^{-1/2}$
after taking the expectation in the $h$-variables.)

Now we sum over $\mathbf x$ (while still keeping $\mathbf a$ frozen). Invoking Proposition 5.3 to estimate the chain vertices
of $\Upsilon$, we therefore find that the contribution of $\Upsilon$ is bounded by

$$ X := \Psi^{|E(\Gamma)| - \sigma}\, \Phi^{|V_m^{(a)}|}\, M^{-|V_m^{(b)}|/2}\, M^{-\sigma/2} \leq \Psi^{|E(\Gamma)| - \sigma}\, \Phi^{|V_m^{(a)}| + |V_m^{(b)}|}\, M^{-\sigma/2} , $$

where we used $M^{-1/2} \leq \Phi$. Since $\sigma(i,j) \geq 2$ in case (c), it is easy to see that $|V_m^{(c)}| \leq \sigma$. Thus we have
$|V_m^{(a)}| + |V_m^{(b)}| + \sigma \geq |V_m(\Gamma)|$. This yields the bound

$$ X \leq \Psi^{|E(\Gamma)|}\, \Phi^{|V_m(\Gamma)|} , $$

where we also used that $\Psi \Phi \geq M^{-1/2}$. Recalling Lemma



we find

$$ X \leq \Psi^{p(\deg(\Delta) + |F|) + |V_c(\Gamma)| - |V_m(\Gamma)|}\, \Phi^{|V_m(\Gamma)|} \leq \Psi^{p(\deg(\Delta) + |F|)}\, \Phi^{p |V_c(\Delta)|} , \qquad (8.11) $$

where we used $|V_m(\Gamma)| \leq |V_c(\Gamma)| = p |V_c(\Delta)|$ and $\Psi \leq \Phi$. This estimate was obtained for the sum over $\mathbf x$
with fixed $\mathbf a$. Finally, we sum over $\mathbf a$ trivially, using the fact that the $\mathbf a$-summation is performed with respect to
a weight. This concludes the proof of Theorem 4.8 under Simplifications (S1) - (S4).
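For the reader's convenience, the exponent bookkeeping that turns the definition of $X$ into a bound in terms of $|E(\Gamma)|$ and $|V_m(\Gamma)|$ can be written out as a single chain of inequalities. The sketch below uses only the relations quoted in the text ($M^{-1/2} \leq \Phi$, $M^{-1/2} \leq \Psi\Phi$, and $|V_m^{(c)}| \leq \sigma$), together with the additional assumption $\Phi \leq 1$, which is not restated in this section.

```latex
\begin{align*}
X &= \Psi^{|E(\Gamma)| - \sigma}\, \Phi^{|V_m^{(a)}|}\, M^{-|V_m^{(b)}|/2}\, M^{-\sigma/2} \\
  &\leq \Psi^{|E(\Gamma)| - \sigma}\, \Phi^{|V_m^{(a)}| + |V_m^{(b)}|}\, (\Psi \Phi)^{\sigma}
      && \text{by } M^{-1/2} \leq \Phi \text{ and } M^{-1/2} \leq \Psi \Phi \\
  &= \Psi^{|E(\Gamma)|}\, \Phi^{|V_m^{(a)}| + |V_m^{(b)}| + \sigma} \\
  &\leq \Psi^{|E(\Gamma)|}\, \Phi^{|V_m(\Gamma)|}
      && \text{by } |V_m^{(a)}| + |V_m^{(b)}| + \sigma \geq |V_m(\Gamma)| \text{ and } \Phi \leq 1 .
\end{align*}
```

The second step is where the loss of two off-diagonal resolvent entries per pair of dotted edges in case (c) is compensated: each factor $M^{-1/2}$ restores one factor $\Psi$ and contributes one additional factor $\Phi$.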



9. Removing Simplifications (S1) - (S4)

In this section we go back to the proof of Theorem 4.8 of Section 8 and give the additional arguments
required to remove the Simplifications (S1) - (S4) which were assumed there. For ease of reference, we recall
them here.

(S1) All summation indices in the expanded summation $\mathbb E |X^w(\Delta)|^p$ (see (6.5) below) are distinct. (I.e. we
ignore repeated indices which give rise to a smaller combinatorics of the summation.)

(S2) There are no diagonal entries $\mathcal G_{aa} = G_{aa} - m$ in $\mathcal Z(\Delta)$. (I.e. $\Delta$ has no loops.)

(S3) We replace any diagonal term $G_{aa}^{(T)}$ with $m$ and any diagonal term $G_{aa}^{(T)*}$ with $\bar m$. (Recall that $G_{aa}^{(T)} \approx m$
in the sense that $G_{aa}^{(T)} - m \prec \Psi$ by definition of $\Lambda$.) This replacement is done in two places: in $\mathcal A(\Gamma)$
and in the identities (3.14a) and (3.14b) which underlie the algebra of vertex resolution.

(S4) The families $\mathbf x = (x_i)_{i \in V_f(\Upsilon)}$ and $\boldsymbol\mu = (\mu_i)_{i \in V_e(\Upsilon)}$ are disjoint, and all indices of $\mathbf x$ are distinct.

9.1. Removing Simplification (S3). We start by removing Simplification (S3), while still assuming Simplifications
(S1), (S2), and (S4). In order to remove Simplification (S3), we have to deal with the error terms made
in the replacement $G_{aa}^{(T)} \approx m$. The key formula for dealing with the diagonal terms is (3.14c) with the error
terms $h_{aa}$, $Z_a^{(T)}$, and $U_a^{(Ta)}$ (see (3.15)). Recall that by (2.8) we have $h_{aa} \prec M^{-1/2}$. The other error terms
are estimated in the following lemma.

Lemma 9.1. Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Fix $\ell \in \mathbb N$. Then we have

Z^ T) ~< U^ s) -< min{^ 2 ,*} , (9.1) 

for \T\,\S\ ^£ and a € {1, . . . , N} \ T. Moreover, is independent of T and Ud is independent of S . 

Proof. See Appendix [A] □ 



Using Lemma 9.1 and (3.14c) we write, for any fixed $K \in \mathbb N$,



r (a\M) 



rz(a\{a}) _ T-r(a) z^(a\{a}) 



K-l 

^Tm fe+1 

fc=0 



{-h aa + z^ a » + upy + o^ K ) . 

(9.2) 



(The error term $O_\prec(\Psi^K)$ is uniform in the same sense as the estimates of (9.1).)

In this section we revisit the argument from Section 8 and explain the differences resulting from the
added diagonal terms.

9.1.1. Generation of the fresh summation vertices (revisited), Step I: from $\Gamma$ to $\Pi$. As in Section 8.2 we start
with $\Gamma \in \mathfrak G_F^p(\Delta)$. The goal in Section 8.2 was to decompose $\Gamma$ into a finite union of graphs (called $\mathfrak R(\Gamma)$)
whereby a resolvent edge of $\Theta \in \mathfrak R(\Gamma)$ encoded in $\mathcal A(\Theta)$ an entry of $G^{(\mathbf a)}$ or $G^{(\mathbf a)*}$, and a dotted edge an
$\mathbf a$-admissible entry of $H$. Thus $\mathcal A(\Theta)$ was well-suited for taking the partial expectation in $\mathbf a$. In this section
we keep track of the diagonal entries (represented graphically by loops) that arise both in Family A and B
identities and were previously freely replaced by powers of $m$ and $\bar m$, according to Simplification (S3). The
main difficulty here is that resolving diagonal entries requires the more complicated formulas (9.2) instead
of (8.4) (which immediately yielded resolvents with upper indices $\mathbf a$).

The ultimate goal of this section and of Section 9.1.2 is the same as that of Section 8.2: to obtain graphs
$\Theta$ whose resolvent edges encode resolvent entries with upper indices $\mathbf a$ (up to a negligible error term that
can be estimated brutally). We shall reach this in two steps. In the first step, which is the content of this
section (Section 9.1.1), we express $\mathcal A(\Gamma)$ as a sum of monomials whose off-diagonal resolvent entries have
upper indices $\mathbf a$ and whose diagonal resolvent entries are maximally expanded. We denote by $\mathfrak D(\Gamma)$ the
family of graphs encoding these new monomials, and we shall use the letter $\Pi$ for a generic element of $\mathfrak D(\Gamma)$.
Graphically, therefore, this step corresponds to the mapping $\Gamma \mapsto \{\Pi_\alpha\} = \mathfrak D(\Gamma)$. Sometimes we shall refer
to it informally as $\Gamma \mapsto \Pi$.

In the second step, which is the content of Section 9.1.2, we use (9.2) to replace all maximally expanded
diagonal resolvent entries (in $\mathcal A(\Pi)$ for any $\Pi \in \mathfrak D(\Gamma)$) with resolvent entries having upper indices $\mathbf a$ (again
up to a negligible error term that can be estimated brutally). We generically call the resulting graphs $\Theta$,
and let $\mathfrak R(\Pi) = \{\Theta_\alpha\}$ be the collection of such graphs obtained from a fixed $\Pi \in \mathfrak D(\Gamma)$. Graphically, this
step corresponds to the mapping $\Pi \mapsto \{\Theta_\alpha\} = \mathfrak R(\Pi)$. The graphs $\Theta$ play the same role as the graphs $\Theta$ in
Section 8. Indeed, each resolvent edge of $\Theta$ encodes in $\mathcal A(\Theta)$ a resolvent entry that has upper indices $\mathbf a$, and
each dotted edge an $\mathbf a$-admissible entry of $H$. Hence $\mathcal A(\Theta)$ is amenable to taking the partial expectation in
$\mathbf a$. (This will be done in Section 9.1.3.)



Lemma 9.2. For any $K \in \mathbb N$ we have the decomposition

$$ \mathcal A(\Gamma) = \sum_\alpha A_\alpha + O_\prec(\Psi^K) , $$

where the summation is over a finite $N$-independent set, and $A_\alpha$ is a monomial in the entries of $G^{(T)}$, the
entries of $G^{(T)*}$, and the $\mathbf a$-admissible entries of $H$. Moreover, each entry $G_{uv}^{(T)}$ of $A_\alpha$ satisfies the condition

(*) $G_{uv}^{(T)}$ either has upper indices $T = \mathbf a$ or is a maximally expanded diagonal entry ($u = v$ and $T = \mathbf a \setminus \{u\}$).

The same condition also applies to each entry $G_{uv}^{(T)*}$.

We shall apply this lemma below with the choice K := p(deg(A) + 2|T4(A)|), which will ensure that the 
error term 0^(^f K ) is negligible. 



Proof of Lemma 19.21 We apply (8.4) to each off-diagonal (maximally expanded) resolvent entry of A(T). 



After an application of (8.4), the resulting expression does not in general satisfy (*), due to the factor 
Gum "'^ on the last line of 



4), which is not maximally expanded. (Note that all other factors on the 



right-hand side of 



rf{a\{u,v}) _ r<(a.\{u}) , 

^* 11.11. ^ 11.11. 



.4) satisfy (*).) 



As always, we use (13.131) to make G4* ^ u ' v '' maximally expanded: 



r ,(a\{u,i)}) 



G 



(a\{u,l)}) r ,(a\{u,ti}) 



r ,(a\{u,-u}) 



n (a.\{u,v}) 



r (a\M) 



(9.3) 

Here we use the first identity of (9.3). The first term is maximally expanded and good as it is. The second consists of two maximally expanded off-diagonal terms in the numerator and one diagonal term in the denominator which is not maximally expanded. We now apply (8.4) to each of the terms in the numerator. The result is an expression with entries of $G$ that either have upper indices $\mathbf{a}$ or are diagonal. The diagonal entries are not maximally expanded, and hence we must apply (3.13) to each of them. Moreover, the diagonal entry in the denominator is not maximally expanded, and must be further expanded using the second identity of (9.3). We continue in this manner, successively using (8.4) on maximally expanded off-diagonal entries and (3.13) on diagonal entries that are not maximally expanded. This procedure is reminiscent of the one introduced after Definition 6.4. As in Section 6, although this procedure in general does not terminate, it does increase the number of off-diagonal terms, which allows us to stop brutally once a sufficient number of off-diagonal terms have been generated. Note that, unlike the one-step iteration of Section 6 (which only used (3.13)), we now have a two-step iteration, which repeatedly uses (3.13) and (8.4) in tandem.
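The two identities in (9.3) are purely algebraic relations between a resolvent and the resolvent of a minor, so they can be checked numerically on a small matrix. The following is a minimal sketch under stated assumptions, not part of the proof: the $4\times 4$ real symmetric matrix `H`, the spectral parameter `z`, and the helper names `inverse` and `resolvent` are all hypothetical choices made for illustration, and the set $T$ of already removed indices is taken to be empty, so that `G` plays the role of $G^{(\mathbf{a}\setminus\{u,v\})}$ and `Gv` the role of $G^{(\mathbf{a}\setminus\{u\})}$.

```python
# Numerical sanity check of the two minor identities in (9.3), in pure Python.

def inverse(mat):
    """Invert a small square complex matrix by Gauss-Jordan elimination."""
    n = len(mat)
    aug = [list(map(complex, row)) + [1.0 + 0j if i == j else 0j for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        # partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(h, z, removed=()):
    """Entries of the resolvent of the minor of h with rows/columns `removed` deleted.

    The result is a dict keyed by the ORIGINAL indices (i, j).
    """
    keep = [i for i in range(len(h)) if i not in removed]
    minor = [[h[i][j] - (z if i == j else 0) for j in keep] for i in keep]
    inv = inverse(minor)
    return {(i, j): inv[a][b] for a, i in enumerate(keep) for b, j in enumerate(keep)}

# Hypothetical test data: a real symmetric matrix and a spectral parameter with Im z > 0.
H = [[0.3, -0.5, 0.2, 0.1],
     [-0.5, -0.1, 0.4, -0.3],
     [0.2, 0.4, 0.6, 0.25],
     [0.1, -0.3, 0.25, -0.4]]
z = 0.2 + 0.5j
u, v = 0, 1

G = resolvent(H, z)                 # v not yet removed
Gv = resolvent(H, z, removed=(v,))  # v removed in addition

# First identity: G_uu = Gv_uu + G_uv G_vu / G_vv
first_lhs = G[u, u]
first_rhs = Gv[u, u] + G[u, v] * G[v, u] / G[v, v]

# Second identity: 1/G_uu = 1/Gv_uu - G_uv G_vu / (G_uu Gv_uu G_vv)
second_lhs = 1 / G[u, u]
second_rhs = 1 / Gv[u, u] - G[u, v] * G[v, u] / (G[u, u] * Gv[u, u] * G[v, v])

print(abs(first_lhs - first_rhs), abs(second_lhs - second_rhs))
```

Both printed differences should be at the level of floating-point rounding. The second identity follows from the first by taking reciprocals, which is why the same off-diagonal product appears in both.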



More formally, the algorithm may be described as follows. In order to define the brutal stopping rule precisely, we set

$$\ell(A) := (\text{number of entries of } G^{(\mathbf{a})} \text{ in } A) + \sum_{a,b \in V_s(\Gamma)} (\text{number of entries } h_{ab} \text{ in } A),$$

where $A$ is a monomial in the entries of $G^{(T)}$ and $H$. Now set $A := \mathcal{A}(\Gamma)$; $A$ will denote the running monomial in the algorithm, and $\mathcal{A}(\Gamma)$ is its initial value.

Step 1. Pick an off-diagonal term $G^{(\mathbf{a}\setminus\{u,v\})}_{uv}$ in $A$ which does not have upper indices $\mathbf{a}$. If no such term exists, go to Step 2. Otherwise apply (8.4) to $G^{(\mathbf{a}\setminus\{u,v\})}_{uv}$. This yields the splitting $A = A' + A''$ (where $A'$ is the main term that contains a factor $G^{(\mathbf{a})}$ and $A'' = 0$ unless both $u$ and $v$ are summation vertices, in which case $A''$ contains the special factor $h_{uv}$ from the third line of (8.4)). Repeat Step 1 for $A'$ and $A''$ (provided $A'' \neq 0$). (Notice that at each repetition of Step 1 the number of off-diagonal terms with no upper index $\mathbf{a}$ decreases by one, so after finitely many steps the algorithm exits to Step 2.)

Step 2. If $\ell(A) \geq K$, stop. Otherwise go to Step 3.

Step 3. Pick a diagonal term $G^{(\mathbf{a}\setminus\{u,v\})}_{uu}$ in $A$ that is not maximally expanded. If no such term exists, stop. Otherwise apply (9.3) to $G^{(\mathbf{a}\setminus\{u,v\})}_{uu}$. This induces a splitting $A = A' + A''$ according to the two summands in either identity of (9.3). Repeat Step 1 for both $A'$ and $A''$.

Since Step 1 increases $\ell$ by exactly one, it follows by Step 2 that the algorithm must terminate after a finite, $N$-independent, number of steps. The result is a finite sum of terms $\{A_\alpha\}$ whose number does not depend on $N$. Pick one such $A_\alpha = A$. We consider two cases depending on whether the algorithm, in generating $A$, stopped at Step 2 or Step 3. In other words, we distinguish whether it stopped because there are sufficiently many small factors (stopping at Step 2) or because each resolvent entry satisfies (*) (stopping at Step 3).

Consider first the case where the algorithm stopped at Step 2. This corresponds to a brutal stopping, where we may estimate $A$ by a simple power counting using the lower bound on $\ell(A)$. We claim that we have the trivial bound

$$A = O_\prec(\Psi^K). \tag{9.4}$$

In order to see this, we note that (8.4) (reading these formulas from right to left) and (2.10) imply$^2$



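$$\sum_x^{(\mathbf{a})} h_{ux}\, G^{(\mathbf{a})}_{x\mu} = O_\prec(\Psi), \qquad \sum_x^{(\mathbf{a})} G^{(\mathbf{a})}_{\mu x}\, h_{xv} = O_\prec(\Psi), \qquad \sum_{x,y}^{(\mathbf{a})} h_{ux}\, G^{(\mathbf{a})}_{xy}\, h_{yv} = O_\prec(\Psi). \tag{9.5}$$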



Moreover, by definition of Step 1 and the explicit expressions in (8.4), each entry of $G^{(\mathbf{a})}$ in $A$ comes in one of the three forms in (9.5). Hence the lower bound $\ell(A) \geq K$, the definition of $\ell$, and (2.10) yield (9.4).

Apart from this error term, all other terms resulted in stopping the algorithm at Step 3. In this case it is immediate that each entry $G^{(T)}_{uv}$ of $A$ satisfies (*). This proves Lemma 9.2. □



The algorithm from the proof of Lemma 9.2 (Steps 1-3) has a trivial reformulation on the level of graphs. This also yields a convenient graphical representation of the monomials $\{A_\alpha\}$. For future use, we give more details on the graphical version of Step 3. Let $\mathfrak{D}(\Gamma)$ denote the set of graphs that encode the monomials $\{A_\alpha\}$ obtained through Steps 1-3 starting from $\Gamma$ and stopping at Step 3. The elements of $\mathfrak{D}(\Gamma)$ will generically be denoted by $\Pi$.

We use the graphical notations from Figures 3.2 and 3.8. In Figure 9.1 we summarize the rules (8.4) graphically.



The term $A''$ from Step 3 arises from taking the second term on the right-hand sides of (9.3), which in the graphical language translates to creating two (non-loop) resolvent edges connecting the vertices $i$ and $j$ associated with the indices $u$ and $v$ respectively, i.e. linking a loop at $i$ with $j$. We call this process linking $i$ with $j$. See Figure 9.2.

2 Note that these estimates also follow directly from basic large deviation results such as Lemmas B.1 and B.2 in [15].

Figure 9.1. The graphical representation of the rules (8.4) (using $a$ and $b$ for summation indices and $\mu$ for the external index instead of the generic indices $u$ and $v$). Two of the loops on the last line correspond to diagonal entries of $G^{(\mathbf{a}\setminus\{a,b\})}$ which are not maximally expanded: they still depend on $a$, which is indicated by the label [a] inside the loop.





Figure 9.2. Linking the vertex $i$ (depicted in the picture with its associated index $u$) with the vertex $j$ (depicted with index $v$). These two diagrams correspond to the two identities in (9.3). As in Figure 9.1, if a diagonal resolvent entry encoded by a loop is not maximally expanded, we indicate the index on which it depends in angular brackets inside the loop.



This graphical algorithm provides an alternative, graphical, construction of $\mathfrak{D}(\Gamma)$ starting from $\Gamma$. Each $\Pi \in \mathfrak{D}(\Gamma)$ encodes a monomial $\mathcal{A}(\Pi)$ whose resolvent entries satisfy (*). As in Section 8, the vertex set of $\Pi$ may be written as $V(\Pi) = V_f(\Pi) \cup V_s(\Pi) \cup V_e(\Pi)$, corresponding to the fresh summation vertices, the original summation vertices, and the external vertices, respectively. Note that $\Pi$ now contains loops, which bear either a black or white diamond (encoding diagonal entries of $G$ in the numerator or denominator respectively). Moreover, each diagonal entry encoded by a loop of $\Pi$ is maximally expanded, and each off-diagonal entry encoded by a non-loop edge of $\Pi$ has upper indices $\mathbf{a}$. See Figure 9.3 for a simple example of the process $\Gamma \mapsto \Pi \in \mathfrak{D}(\Gamma)$.





Figure 9.3. The process $\Gamma \mapsto \Pi \in \mathfrak{D}(\Gamma)$, where we draw the two simplest elements of $\mathfrak{D}(\Gamma)$ on the bottom line. (For reasons of space, we omit the labels $a$ and $b$ on the bottom line.) The first graph is just the top-middle graph of Figure 8.5 but with added loops. In the second graph one loop at $a$ was linked with $b$.



9.1.2. Generation of the fresh summation vertices (revisited), Step II: from $\Pi$ to $\Theta$. In this section we complete the second part of the generation of the fresh summation vertices, by constructing a family of graphs $\Theta \in \mathfrak{R}(\Pi)$ from $\Pi$. The underlying algebraic identity is (9.2).

Lemma 9.3. For any $K \in \mathbb{N}$ and any $\Pi \in \mathfrak{D}(\Gamma)$ there is a decomposition

$$\mathcal{A}(\Pi) = \sum_\alpha B_\alpha + O_\prec(\Psi^K) \tag{9.6}$$

such that each $B_\alpha$ is a monomial in the entries of $G^{(\mathbf{a})}$ and the $\mathbf{a}$-admissible entries of $H$. The sum over $\alpha$ ranges over a finite set that is independent of $N$.

We shall apply Lemma 9.3 with the choice $K := p(\deg(\Delta) + 2|V_s(\Delta)|)$, which will ensure that the error term $O_\prec(\Psi^K)$ is negligible.



Proof of Lemma 9.3. We simply apply (9.2) to each diagonal resolvent entry of $\mathcal{A}(\Pi)$. Recall that each diagonal resolvent entry of $\mathcal{A}(\Pi)$ is maximally expanded, which implies that all resolvent entries explicitly appearing in the definition (3.15) for $Z_a^{(\mathbf{a}\setminus\{a\})}$ and $U_a^{(\mathbf{a})}$ have upper indices $\mathbf{a}$. Then, as above, it immediately follows that if we pick the rest term $O_\prec(\Psi^K)$ in (9.2) from any diagonal entry, the resulting monomial is $O_\prec(\Psi^K)$ and may be absorbed into the error term on the right-hand side of (9.6).

For the following we therefore assume that there are no rest terms $O_\prec(\Psi^K)$ in the expansion (9.2) of the diagonal resolvent entries of $\mathcal{A}(\Pi)$. The result is a finite family of monomials whose number does not depend on $N$ (but does of course depend on $K$), and which may again be represented graphically. In such graphs we represent a term $U_a^{(\mathbf{a})}$ with a solid ring around the vertex associated with $a$ (these terms $U_a^{(\mathbf{a})}$ will not be expanded further, so their precise structure does not matter; the number of rings simply encodes their size).

Figure 9.4. The graphical representation of the three error terms resulting from the expansion of $G_{aa}^{(\mathbf{a}\setminus\{a\})}$ using (9.2); the panels include $Z_a^{(\mathbf{a}\setminus\{a\})}$ and $U_a^{(\mathbf{a})}$. Apart from the entries of $H$ encoded by the dotted edges, all terms are independent of $\mathbf{a}$.



See Figure 9.4 for a depiction of the three nontrivial terms arising from the expansion of $G_{aa}^{(\mathbf{a}\setminus\{a\})}$. Thus, when expanding a loop at $a$ with a white diamond (encoding $1/G_{aa}^{(\mathbf{a}\setminus\{a\})}$), we replace the loop with either nothing (corresponding to the term $1/m$ used in the argument of Section 8) or one of the three pieces in Figure 9.4. Similarly, when expanding a loop at $a$ with a black diamond (encoding $G_{aa}^{(\mathbf{a}\setminus\{a\})}$), we replace the loop with either nothing (corresponding to a factor $m$ coming from the zeroth order term in the summation in (9.2)) or an agglomeration of pieces from Figure 9.4 at the vertex associated with $a$. (We use concentric rings around $a$ to depict several factors $U_a^{(\mathbf{a})}$.) See Figure 9.5.

Figure 9.5. Expanding a loop. The two lines correspond to the two identities of (9.2) respectively.



The application of (9.2) to the diagonal entries encoded by $\Pi$ yields a new family of graphs, which we call $\mathfrak{R}(\Pi)$ and whose elements we denote by $\Theta$. Each resolvent edge of $\Theta \in \mathfrak{R}(\Pi)$ now encodes an entry of $G^{(\mathbf{a})}$ or $G^{(\mathbf{a})*}$. We also have the usual self-explanatory splitting $V(\Theta) = V_f(\Theta) \cup V_s(\Theta) \cup V_e(\Theta)$. See Figure 9.6 for an example of the process $\Pi \mapsto \mathfrak{R}(\Pi)$. □



Summarizing the results of Sections 9.1.1 and 9.1.2, for a given $\Gamma \in \mathfrak{G}^p_F(\Delta)$, we have constructed an $N$-independent set of graphs,

$$\mathfrak{R}(\mathfrak{D}(\Gamma)) = \bigcup \{\mathfrak{R}(\Pi) : \Pi \in \mathfrak{D}(\Gamma)\}.$$

If $\Theta \in \mathfrak{R}(\mathfrak{D}(\Gamma))$ then each resolvent entry of $\mathcal{A}(\Theta)$ has upper indices $\mathbf{a}$, and the fresh summation indices $\mathbf{x} = (x_i)_{i \in V_f(\Theta)}$ and the original summation indices $\mathbf{a} = (a_i)_{i \in V_s(\Theta)}$ are disjoint. Moreover, we have the splitting

$$\mathcal{A}_{\mathbf{a}}(\Gamma) = \sum_{\Theta \in \mathfrak{R}(\mathfrak{D}(\Gamma))} \sum_{\mathbf{x}}^{(\mathbf{a})} \mathcal{A}_{\mathbf{a},\mathbf{x}}(\Theta) + O_\prec\big(\Psi^{p(\deg(\Delta) + 2|V_s(\Delta)|)}\big),$$

where we explicitly indicated the set of summation indices in the subscript of $\mathcal{A}$, see (8.3). Note that the elements of the family $\mathfrak{R}(\mathfrak{D}(\Gamma))$ have the same properties as the elements of the smaller set $\mathfrak{R}(\Gamma)$ from Section 8.

Figure 9.6. The process $\Gamma \mapsto \Pi \mapsto \Theta$. On the second line we draw four graphs $\Theta$ corresponding to choosing one of the four terms on the right-hand side of the first equation of (9.2). On the third line we draw some more complicated graphs $\Theta$.



9.1.3. Lumping of the fresh summation vertices (revisited) and conclusion of the estimate. Fix a $\Theta \in \mathfrak{R}(\mathfrak{D}(\Gamma))$. Now we may proceed as in Section 8.3 and take the lumping of the entries of $H$ in $\mathcal{A}(\Theta)$ by computing their partial expectation $\prod_{a \in \mathbf{a}} P_a$. Since all resolvent entries of $\mathcal{A}(\Theta)$ are independent of $\mathbf{a}$, this partial expectation acts only on the entries of $H$, and leads to lumpings exactly as in Section 8.3. This gives rise to a family of graphs $\Upsilon \in \mathfrak{L}(\Theta)$. As before, we seek to gain a factor $\Phi$ from each marked vertex $i \in V_m(\Upsilon)$.

Thus, let us fix a sequence $\Gamma \mapsto \Pi \mapsto \Theta \mapsto \Upsilon$. It is convenient to extend the definition of the degree of a vertex as follows. By definition, the degree of $i \in V(\Theta)$, written $\deg_\Theta(i)$, is equal to the number of legs incident to $i$ plus two times the number of rings around $i$. This convention is chosen so that each error term in Figure 9.4 increases the degree of $i$ by two.

Now take a marked vertex $i \in V_m(\Upsilon)$. Note that, by construction of $\Pi$ and $\Theta$, we have $\deg_\Theta(i) \geq \deg_\Gamma(i)$. We consider two cases.

(i) Suppose that $\deg_\Theta(i) = \deg_\Gamma(i)$. This means that in the process $\Gamma \mapsto \Pi$ the original summation vertex $i$ was not linked with another original summation vertex (see Section 9.1.1), and that in the process $\Pi \mapsto \Theta$ (see Section 9.1.2) we always chose the main term ($1/m$ or $m$) on the right-hand sides of (9.2) when applying (9.2) to any diagonal entries with lower indices $a_i$. In particular, (8.10) holds. We may therefore proceed exactly as in Section 8: any pairing in $\Theta \mapsto \Upsilon$ of the white vertices adjacent to $i$ gives rise to at least one chain vertex of $\Upsilon$ (see Definition 8.6). A higher-order lumping (i.e. one that is not a pairing) gives rise to a positive power of $M^{-1/2} \leq \Psi$. Either way, we shall gain a factor $\Phi$ from $i$ after summing over $\mathbf{x}$ and invoking Proposition 5.3.

(ii) Suppose that $\deg_\Theta(i) > \deg_\Gamma(i)$. In this case we have that either

(ii.1) in the process $\Gamma \mapsto \Pi$ the vertex $i$ was linked to another original summation vertex, or

(ii.2) in the process $\Pi \mapsto \Theta$ we chose at least one error term (represented graphically by one of the graphs in Figure 9.4) on the right-hand sides of (9.2) when applying (9.2) to the diagonal entries with lower indices $a_i$.

We claim that either case, (ii.1) or (ii.2), results in an extra error factor in $\Upsilon$ of order $\Psi$. In order to see this, consider first the case (ii.1). Here the linking means that there is a $j \in V_s(\Gamma)$ such that in $\Upsilon$ we have two extra resolvent edges (as compared to case (i)), each connecting a vertex in $p^{-1}(i)$ to a vertex in $p^{-1}(j)$. This yields a factor $\Psi^2$. Thus we gain a factor $\Psi$ that we ascribe to $i$ (the other factor $\Psi$ is in general not available, as it may be needed for exactly the same reason at the vertex $j$). Next, consider the case (ii.2). If there is a term $h_{aa}$ or $U_a^{(\mathbf{a})}$ in $\mathcal{A}(\Theta)$, then we immediately get a factor $\Psi$. (For $U_a^{(\mathbf{a})}$ this is trivial by (9.1), and for $h_{aa}$ taking $P_a$ implies that there must be at least another factor $h_{aa}$, in which case we get a factor $M^{-1/2} \leq \Psi$.) Finally, if we have a factor $Z_a^{(\mathbf{a}\setminus\{a\})}$, we observe that in the expression

$$Z_a^{(\mathbf{a}\setminus\{a\})} = Q_a \Big( \sum_{x,y}^{(\mathbf{a})} h_{ax}\, G^{(\mathbf{a})}_{xy}\, h_{ya} \Big) \tag{9.7}$$

we cannot pair $h_{ax}$ with $h_{ya}$ when computing the partial expectation $P_a$ (since $P_a Q_a = 0$). (Of course in a higher-order lumping, they could be in the same lump provided this lump contains at least three elements.) This implies that, in the leading-order pairing, the fresh summation vertices of $\Theta$ associated with $x$ and $y$ will be paired into different vertices of $\Upsilon$. In particular, we gain an additional off-diagonal resolvent entry $G^{(\mathbf{a})}_{xy} \prec \Psi$. See Figure 9.7 for a graphical depiction of this lumping.




Figure 9.7. The lumping of fresh summation vertices in the presence of a factor $Z_a^{(\mathbf{a}\setminus\{a\})}$ (represented by a triangle in $\Theta$). Due to the $Q_a$ in the definition of $Z_a^{(\mathbf{a}\setminus\{a\})}$, the last graph (crossed out in grey) does not contribute.



In summary, each marked vertex $i$ therefore yields a gain of $\Phi$ upon summation over $\mathbf{x}$. This concludes the proof of Theorem 4.8 without Simplification (S3).

9.2. Removing Simplification (S2). In this section we revisit the arguments of Sections 8 and 9.1 and explain the modifications required if we relax Simplification (S2), i.e. allow diagonal entries $\mathcal{G}_{aa} = G_{aa} - m$ in the definition of $Z$. (On the level of $\Delta$, this amounts to allowing loops.) The construction of $\Gamma \in \mathfrak{G}^p_F(\Delta)$ remains unchanged. Each diagonal entry of $\mathcal{A}(\Gamma)$ is maximally expanded, i.e. of the form $\mathcal{G}_{aa}^{(\mathbf{a}\setminus\{a\})}$. Hence the construction of $\Pi \in \mathfrak{D}(\Gamma)$ from $\Gamma$ carries over unchanged from Section 9.1.1. Now $\Pi$ has loops of three kinds: with a black diamond (encoding $G_{aa}^{(\mathbf{a}\setminus\{a\})}$), with a white diamond (encoding $1/G_{aa}^{(\mathbf{a}\setminus\{a\})}$), and plain (encoding $\mathcal{G}_{aa}^{(\mathbf{a}\setminus\{a\})}$). Note that a decorated loop encodes a factor of size $O_\prec(1)$ while a plain loop encodes a factor of size $O_\prec(\Psi)$. The additional difficulty in this section as compared to Section 8 is that the naive size of a plain loop is smaller than the size of the decorated loops dealt with in Section 8. Thus we have to establish bounds which, in addition to the gain extracted in Section 8, also contain the smallness associated with the naive size of a plain loop.

The process $\Pi \mapsto \Theta \in \mathfrak{R}(\Pi)$, in which all (maximally expanded) diagonal entries are expanded using (9.2), is again the same as that of Section 9.1.2. For $1/G_{aa}^{(\mathbf{a}\setminus\{a\})}$ and $G_{aa}^{(\mathbf{a}\setminus\{a\})}$ we use (9.2), and, in addition, for $\mathcal{G}_{aa}^{(\mathbf{a}\setminus\{a\})}$ we use

$$\mathcal{G}_{aa}^{(\mathbf{a}\setminus\{a\})} = \sum_{k=1}^{K-1} m^{k+1} \Big( -h_{aa} + Z_a^{(\mathbf{a}\setminus\{a\})} + U_a^{(\mathbf{a})} \Big)^k + O_\prec(\Psi^K) \tag{9.8}$$

(note that the sum starts with $k = 1$). Finally, lumping the white summation vertices yields the graph $\Upsilon$. As in Section 9.1.3, the important observation is that the two white vertices associated with a $Z_a^{(\mathbf{a}\setminus\{a\})}$ (see Figure 9.4) cannot be paired. In summary, the resolution process $\Gamma \mapsto \Pi \mapsto \Theta \mapsto \Upsilon$ is almost identical to that in Sections 8 and 9.1. There is only one new ingredient: the expansion (9.8) which starts with $k = 1$. To illustrate this procedure, let us consider the simple example with $\mathbf{a} = \{a\}$



(a) 



^Ga^Qaa E ^ G aa G aa ha X G^ x ^G^ hyaQaa 



x,y 



(a) . (a) 

2 ~E^/? G^G^*h ( O h (7 (a) /? -h + f/ (a) 

x,y ^ z,w 



m m 



(a) 



m 2 m 



E^Sn8ayGffiG$G<$ +0 + m 2 mE^ s ax G$* G^U^ 



(9.9) 



where $+ \cdots$ denotes higher-order terms in the expansions (9.2) and (9.8). The expectation of the middle term in the parentheses vanishes because $\mathbb{E} h_{aa} = 0$. See Figure 9.8 for a graphical version of (9.9).



Note that each unmarked loop (encoding a diagonal entry of $\mathcal{G}$) contributes a factor $O_\prec(\Psi)$ to $\mathcal{A}(\Gamma)$. When performing the vertex resolution $\Gamma \mapsto \Upsilon$, we therefore have to ensure that this gain of $\Psi$ is not lost (i.e. that $\mathcal{A}(\Upsilon)$ has an associated factor of size $O_\prec(\Psi)$). In addition, we have to gain a factor $\Phi$ from each marked vertex of $\Gamma$.

In the example (9.9), the vertex $a$ is marked and each term on the bottom line of (9.9) is of order $\Psi^3 \Phi$. This bound should be read as $\Psi^2 \cdot \Psi \cdot \Phi$, where $\Psi^2$ is the trivial bound on the off-diagonal entries, $\Psi$ is the bound on the diagonal entry of $\mathcal{G}$, and $\Phi$ is the additional gain arising from the fact that $a$ is marked. Indeed, the first term on the bottom line of (9.9) is of order $\Psi^3 \Phi$ by Proposition 5.3 and the last term is of order $\Psi^3 \Phi$ by Lemma 9.1.



This is in fact a general phenomenon. Let $i \in V_m(\Upsilon)$ be marked, with associated summation index $a$. We shall give the details only for a leading-order graph $\Upsilon$, i.e. a graph $\Upsilon$ that satisfies:

(i) In the process $\Gamma \mapsto \Pi$ the original summation vertex $i$ was not linked with another original summation vertex.

(ii) In the process $\Pi \mapsto \Theta$ we always chose the main term ($1/m$ or $m$) on the right-hand sides of (9.2), and the term $m^2 Z_a^{(\mathbf{a}\setminus\{a\})}$ on the right-hand side of (9.8).

(iii) In the process $\Theta \mapsto \Upsilon$, we chose a pairing of the white vertices incident to $i$. (I.e. no higher-order lumping is allowed.)

Figure 9.8. The complete resolution process $\Gamma \mapsto \Pi \mapsto \Theta \mapsto \Upsilon$ for the example (9.9). At each step we only draw the leading-order graphs. The first, second, and third lines of the figure correspond to the first, second, and third lines of (9.9) respectively.



If $\Upsilon$ does not satisfy (i)-(iii), an argument almost identical to that of Sections 8.4 and 9.1.3 yields an extra factor $\Phi$ in addition to $\Psi^l$, where $l$ is the number of plain loops incident to $i$ in $\Gamma$. (This is a simple power counting that uses the fact that each $U_a^{(\mathbf{a})}$ yields a factor $\Psi\Phi$, and each $h_{aa}$ and $Z_a^{(\mathbf{a}\setminus\{a\})}$ yield a factor $\Psi$ each. That $Z_a^{(\mathbf{a}\setminus\{a\})}$ yields a factor $\Psi$ follows from the observation that after resolution it yields an off-diagonal resolvent entry in $\mathcal{A}(\Upsilon)$, as explained after (9.7). Note that, unlike in Section 9.1.3 where it was enough to gain a factor $\Psi$ from $U_a^{(\mathbf{a})}$, here it is crucial that $U_a^{(\mathbf{a})} \prec \Psi\Phi$.)

Let us therefore assume that $\Upsilon$ satisfies (i)-(iii). By (i) and (ii) we have (8.10). Recall the definition of the projection $p$ from (iv) in Section 8.3. Each $j \in p^{-1}(i)$ is incident to precisely two resolvent edges and one wiggly edge that is also incident to $i$. Moreover, no vertex of $p^{-1}(i)$ is incident to a loop; this follows from the above observation that the two white vertices associated with $m^2 Z_a^{(\mathbf{a}\setminus\{a\})}$ cannot be paired. From (…2) and (8.10), we therefore find that at least one vertex in $p^{-1}(i)$ is a chain vertex. Consequently summation over $\mathbf{x}$ results in an extra factor $\Phi$ by Proposition 5.3 and hence completes the argument.



9.3. Removing Simplification (S4). In this section we remove Simplification (S4), by allowing the fresh summation indices $\mathbf{x}$ to coincide with each other and with external indices $\boldsymbol{\mu}$. This entails proving Proposition 5.3 without the simplifying assumption (S4) that was assumed in its proof. Roughly, there are two kinds of problems arising from such coincidences: an off-diagonal resolvent entry $G_{xy}$ may become diagonal (hence leading to a loss of a factor $\Psi$), and a chain vertex may cease to be one (hence leading to a loss of a factor $\Phi$). However, these losses are compensated by powers of $M^{-1}$ resulting from a reduction in the number of independent summation variables. The main point is to prove that each coincidence of summation variables results in a loss of at most two factors of $\Psi$ and at most two factors of $\Phi$. Since $M^{-1} \leq \Psi^2 \Phi^2$, the gain of $M^{-1}$ will be enough to compensate this loss.

Throughout Sections 8, 9.1, and 9.2, we invoked Proposition 5.3 in order to gain from chain vertices. To that end, we had to assume Simplification (S4) (since the indices $(\mathbf{x}, \boldsymbol{\mu})$ are assumed to be disjoint in Proposition 5.3). The main result of this section is the following extension of Proposition 5.3. It states that the stochastic bound of Proposition 5.3 is valid even if the summation over $\mathbf{x}$ has no restriction. (As in Proposition 5.3 we use $\mathbf{a}$ to denote the summation indices; in our applications of Proposition 9.4, $\mathbf{a}$ always consists of fresh summation vertices which we denoted by $\mathbf{x}$ in Sections 8, 9.1, and 9.2.)

Proposition 9.4. Suppose that $\Lambda \prec \Psi$ for some admissible control parameter $\Psi$. Let $\Delta$ be a chain encoding $Z_{\mathbf{a}}$. Then

$$\sum_{\mathbf{a}} w(\mathbf{a}) Z_{\mathbf{a}} \prec \Psi^{\deg(\Delta)} \Phi^{c(\Delta)} \tag{9.10}$$

for any $\boldsymbol{\mu}$ and chain weight $w$.

Proof. The basic idea is to split the summation into partitions

$$\sum_{\mathbf{a}} w(\mathbf{a}) Z_{\mathbf{a}} = \sum_P \sum_{\mathbf{a}} \mathbf{1}(\mathcal{P}(\boldsymbol{\mu}, \mathbf{a}) = P)\, w(\mathbf{a}) Z_{\mathbf{a}},$$

where $P$ ranges over all partitions of $V(\Delta)$, and $\mathcal{P}$ was introduced in Definition 4.3. Note that, since $\boldsymbol{\mu}$ are constrained to be distinct, if $P$ yields a nonzero contribution each of its blocks may contain at most one vertex in $V_e(\Delta)$. For the following we fix a partition $P$ and prove that

$$X_P := \sum_{\mathbf{a}} \mathbf{1}(\mathcal{P}(\boldsymbol{\mu}, \mathbf{a}) = P)\, w(\mathbf{a}) Z_{\mathbf{a}}$$

is stochastically bounded by the right-hand side of (9.10). On the level of the graph $\Delta$, a nontrivial partition $P$ of $V(\Delta)$ results in a merging of vertices of $V(\Delta)$. By merging vertices of $\Delta$ we therefore get a new graph which we denote by $P(\Delta)$. The vertex set of $P(\Delta)$ has the usual decomposition $V(P(\Delta)) = V_e(P(\Delta)) \cup V_s(P(\Delta))$, where $V_e(P(\Delta)) = V_e(\Delta)$ and $V_s(P(\Delta))$ is given by the set of blocks of $P$ that do not contain a vertex from $V_e(\Delta)$. A vertex $i \in V(P(\Delta))$ is unmerged if the corresponding block has size one, and merged otherwise. See Figure 9.9 for an example of the merging $\Delta \mapsto P(\Delta)$.




Figure 9.9. The merging $\Delta \mapsto P(\Delta)$ of summation vertices of a chain. The vertices of $\Delta$ are $V_s(\Delta) = \{1, \dots, 7\}$ and $V_e(\Delta) = \{0, 8\}$. We chose the partition $P = \{\{0, 1, 2\}, \{3, 7\}, \{4\}, \{5, 6\}, \{8\}\}$.



For concreteness assume first that $\Delta$ is an open chain with $V(\Delta) = \{0, \dots, n\}$, where $V_e(\Delta) = \{0, n\}$. Thus, $\deg(\Delta) = n$. For any graph $\Delta'$ define the set of vertices

$$V_g(\Delta') := \{i \in V_s(\Delta') : i \text{ has degree two without counting loops}\}.$$

Thus, the set $V_g(P(\Delta))$ includes not only the chain vertices of $P(\Delta)$ but also chain vertices to which one or more loops are attached. Let $r(\Delta')$ denote the number of edges of $\Delta'$ that are not loops, and set $c(\Delta') := |V_g(\Delta')|$. In particular, if $\Delta'$ is a chain then this definition agrees with that from Definition 5.1. (For example, in Figure 9.9 we have $c(P(\Delta)) = 2$ and $r(P(\Delta)) = 5$.)

Define $k := n - 1 - |V_s(P(\Delta))|$. Informally, $k$ is the number of summation vertices of $V_s(\Delta)$ that have been merged into some other vertex. As we shall see, $k$ is the exponent of $M^{-1}$ which describes the reduction in the combinatorics of the summation. (In Figure 9.9 we have $n = 8$ and $|V_s(P(\Delta))| = 3$, which gives $k = 4$.) We claim that

$$n - k \leq r(P(\Delta)) \leq n, \qquad r(P(\Delta)) + c(P(\Delta)) \geq 2n - 1 - 2k. \tag{9.11}$$

The easiest way to prove (9.11) is by the following inductive argument. We construct $P(\Delta)$ from $\Delta$ by successively merging one vertex at a time, and follow the change of the functions $r(\cdot)$ and $r(\cdot) + c(\cdot)$ at each step. The formal procedure is the following. We construct a sequence of graphs $\Delta_0 = \Delta, \Delta_1, \dots, \Delta_k = P(\Delta)$ as follows. We start from $\Delta_0 := \Delta$. Recall that the vertices of $\Delta$ are naturally ordered by $\leq$. Let $i_1 \in V_s(\Delta)$ be the smallest vertex of $\Delta$ that is in a nontrivial block (i.e. of size greater than one) of $P$. Set $\Delta_1$ to be the graph obtained from $\Delta_0$ by merging $i_1$ with the (unique) vertex $j \in V(\Delta_0)$ satisfying $j < i_1$. The vertices of $\Delta_1$ remain ordered after we assign the newly created merged vertex the index $j$. Similarly, $\Delta_{l+1}$ is obtained from $\Delta_l$ by choosing the smallest unmerged vertex $i_{l+1} \in V_s(\Delta_l)$ that is in a nontrivial block of $P$, and merging it with the unique $j \in V(\Delta_l)$ satisfying $j < i_{l+1}$. After $k$ steps of this procedure, we obtain $\Delta_k = P(\Delta)$. Moreover, it is easy to see for $0 \leq l \leq k - 1$ that

$$r(\Delta_l) - 1 \leq r(\Delta_{l+1}) \leq r(\Delta_l), \qquad r(\Delta_{l+1}) + c(\Delta_{l+1}) \geq r(\Delta_l) + c(\Delta_l) - 2. \tag{9.12}$$

Indeed, either $i_{l+1}$ is merged with a vertex adjacent to itself, in which case we have $r(\Delta_{l+1}) = r(\Delta_l) - 1$ and $c(\Delta_{l+1}) \geq c(\Delta_l) - 1$, or $i_{l+1}$ is merged with a vertex not adjacent to itself, in which case we have $r(\Delta_{l+1}) = r(\Delta_l)$ and $c(\Delta_{l+1}) \geq c(\Delta_l) - 2$. Since $r(\Delta) + c(\Delta) = 2n - 1$, (9.11) follows from (9.12).
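The counting claim (9.11) can also be checked by brute force for short open chains. The sketch below is an illustration under stated assumptions rather than a replacement for the inductive argument: the function names `set_partitions`, `merged_chain_stats`, and `check_claim` are hypothetical, the chain is modelled as the path $0 - 1 - \cdots - n$ with external vertices $0$ and $n$, and multiple edges between two merged vertices are counted with multiplicity in $r$.

```python
def set_partitions(items):
    """Yield all partitions of the list `items` as lists of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # put `first` into an existing block ...
        for idx in range(len(part)):
            yield part[:idx] + [[first] + part[idx]] + part[idx + 1:]
        # ... or into a new block of its own
        yield [[first]] + part

def merged_chain_stats(n, partition):
    """Return (k, r, c) for the open chain 0-1-...-n merged according to `partition`."""
    block_of = {v: b for b, block in enumerate(partition) for v in block}
    external = {block_of[0], block_of[n]}
    summation = set(range(len(partition))) - external
    degree = {b: 0 for b in range(len(partition))}
    r = 0
    for v in range(n):                      # chain edge {v, v+1}
        a, b = block_of[v], block_of[v + 1]
        if a != b:                          # loops are not counted in r or in degrees
            r += 1
            degree[a] += 1
            degree[b] += 1
    c = sum(1 for b in summation if degree[b] == 2)
    k = n - 1 - len(summation)
    return k, r, c

def check_claim(n):
    """Check both inequalities of (9.11) for every admissible partition of {0,...,n}."""
    for partition in set_partitions(list(range(n + 1))):
        block_of = {v: b for b, block in enumerate(partition) for v in block}
        if block_of[0] == block_of[n]:
            continue  # the two external vertices must stay in different blocks
        k, r, c = merged_chain_stats(n, partition)
        if not (n - k <= r <= n and r + c >= 2 * n - 1 - 2 * k):
            return False
    return True

# The partition of Figure 9.9 (n = 8).
fig_9_9 = [[0, 1, 2], [3, 7], [4], [5, 6], [8]]
print(merged_chain_stats(8, fig_9_9))            # (4, 5, 2)
print(all(check_claim(n) for n in range(1, 7)))  # True
```

For the partition of Figure 9.9 this reproduces $k = 4$, $r(P(\Delta)) = 5$ and $c(P(\Delta)) = 2$, and the second inequality of (9.11) holds with equality there: $5 + 2 = 2 \cdot 8 - 1 - 2 \cdot 4$.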

We may now sum over $(a_i)_{i \in V_s(P(\Delta))}$. To that end, if $i \in V_s(P(\Delta))$ and there is a loop (or several loops) at $i$, then we expand each corresponding diagonal term $G_{a_i a_i}$ as $G_{a_i a_i} = m + (G_{a_i a_i} - m)$. If we pick a factor $m$ from each loop, $i$ becomes a chain vertex. If we pick at least one factor $G_{a_i a_i} - m$, $i$ is not a chain vertex but carries a factor of order $\Psi$. Either way, summing over $a_i$ yields a factor $\Phi$ by Proposition 5.3. (Note that Proposition 5.3 is applicable to the graph $P(\Delta)$ because all summation indices are constrained to be distinct.) Thus we get the bound

$$X_P \prec M^{-k}\, \Psi^{r(P(\Delta))}\, \Phi^{c(P(\Delta))} \leq M^{-k}\, \Psi^{n-k}\, \Phi^{n-1-2k},$$

where in the last step we used (9.11). Since $M^{-1} \Psi^{-1} \Phi^{-2} \leq 1$ we find $X_P \prec \Psi^n \Phi^{n-1}$, which is (9.10).

The case of a closed chain $\Delta$ of degree $n$ is handled similarly. For definiteness assume that $\Delta$ has no external vertex. Now we have $k := n - |V_s(P(\Delta))| \leq n - 1$ and we let $l$ range from $0$ to $k$. Then (9.12) holds for $l = 0, \dots, n - 3$. If $l = n - 2$ then (9.12) is in general false (as can be seen e.g. on the closed chain of degree two with $V_s(\Delta) = \{1, 2\}$ and $P = \{\{1, 2\}\}$). In that case we replace it with the trivial bounds $r(\Delta_{n-1}) \geq 0$ and $c(\Delta_{n-1}) \geq 0$. Thus if $k \leq n - 2$ then we find (9.10) exactly as above, and if $k = n - 1$ we get using $n \geq 2$

$$X_P \prec M^{-k}\, \Psi^{r(P(\Delta))}\, \Phi^{c(P(\Delta))} \leq M^{-n+1} \leq \Psi^{\deg(\Delta)} \Phi^{c(\Delta)},$$

which is (9.10). □



To conclude this section, we address an issue concerning coinciding indices that was repeatedly swept under the rug in Sections 7, 8, 9.1, and 9.2. Essentially, we do an inclusion-exclusion argument on the values of the summation indices of a union of chains so as to decouple the summations associated with different subchains. Recall the definition of $X^w(\Delta)$ from (4.5).

Lemma 9.5. If $\Delta = \Delta_1 \cup \cdots \cup \Delta_k$ is a union$^3$ of chains then

$$X^w(\Delta) \prec \prod_{l=1}^{k} \Psi^{\deg(\Delta_l)} \Phi^{c(\Delta_l)}. \tag{9.13}$$

In words: if $\Delta$ is a union of chains, then in the summation over $\mathbf{a}$ in $X^w(\Delta)$ we can decouple the summations associated with different subchains of $\Delta$. (In $X^w(\Delta)$ these summations are coupled by the constraint that indices associated with different subchains are distinct.)



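Proof of Lemma 9.5. This is a simple decoupling of the summation indices. Let $\mathbf{a}^l$ and $\boldsymbol{\mu}^l$ denote the summation and external indices of $\Delta_l$. Abbreviate $Z_{\mathbf{a}^l}(\Delta_l) = Z^l_{\mathbf{a}^l}$. Thus we have

$$X^w(\Delta) = \sum_{\mathbf{a}^1, \dots, \mathbf{a}^k}^{(\boldsymbol{\mu}^1 \cdots \boldsymbol{\mu}^k)*} w(\mathbf{a}^1, \dots, \mathbf{a}^k)\, Z^1_{\mathbf{a}^1} \cdots Z^k_{\mathbf{a}^k} = \sum_{\mathbf{a}^1, \dots, \mathbf{a}^k} I(\mathbf{a}^1, \dots, \mathbf{a}^k)\, w(\mathbf{a}^1, \dots, \mathbf{a}^k)\, Z^1_{\mathbf{a}^1} \cdots Z^k_{\mathbf{a}^k},$$

where the indicator function $I$ explicitly enforces that all $a^l_i$'s are distinct from all $\mu^m_j$'s and that they are distinct among themselves. Explicitly,

$$I(\mathbf{a}^1, \dots, \mathbf{a}^k) := \Bigg[ \prod_{l=1}^{k} \prod_{i < j \in V_s(\Delta_l)} \big(1 - \mathbf{1}(a^l_i = a^l_j)\big) \Bigg] \Bigg[ \prod_{l, m \leq k} \prod_{i \in V_s(\Delta_l)} \prod_{j \in V_e(\Delta_m)} \big(1 - \mathbf{1}(a^l_i = \mu^m_j)\big) \Bigg] \Bigg[ \prod_{l < m \leq k} \prod_{i \in V_s(\Delta_l)} \prod_{j \in V_s(\Delta_m)} \big(1 - \mathbf{1}(a^l_i = a^m_j)\big) \Bigg].$$

Multiplying out each parenthesis in the definition of $I$, we get a splitting of the form $I = \sum_\alpha I_\alpha$ (the sum ranges over a finite set which depends only on $\Delta$). For each $\alpha$, we may now estimate

$$\sum_{\mathbf{a}^1, \dots, \mathbf{a}^k} I_\alpha(\mathbf{a}^1, \dots, \mathbf{a}^k)\, w(\mathbf{a}^1, \dots, \mathbf{a}^k)\, Z^1_{\mathbf{a}^1} \cdots Z^k_{\mathbf{a}^k} \prec \prod_{l=1}^{k} \Psi^{\deg(\Delta_l)} \Phi^{c(\Delta_l)}. \tag{9.14}$$

To see this, we note that picking the term $\mathbf{1}(\cdots)$ from the parenthesis $(1 - \mathbf{1}(\cdots))$ results in the merging of two vertices. Thus, the left-hand side of (9.14) is encoded by a graph $\Delta^{(\alpha)}$ obtained from $\Delta$ by merging vertices according to $I_\alpha$. Whenever two vertices are merged, we may lose two chain vertices, but gain a power $M^{-1}$ from the chain weight (since if indices $a$ and $a'$ coincide, then one of the factors $s_{ab}$ and $s_{a'b'}$ in the chain weight (see (5.1)) can be dropped from the weight and estimated by $M^{-1}$). The associated loss of $\Phi^2$ is therefore compensated by $M^{-1} \leq \Phi^2$. In order to gain from the chain vertices in the merged graph, we invoke Proposition 9.4 to get

$$\sum_{\mathbf{a}'} w(\mathbf{a}')\, Z_{\mathbf{a}'}(\Delta') \prec \Psi^{\deg(\Delta')} \Phi^{c(\Delta')} \tag{9.15}$$

for each subchain $\Delta'$ of $\Delta^{(\alpha)}$. Here (9.15) is applicable because the left-hand side of (9.14) factors into a product of expressions encoded by the subchains of $\Delta^{(\alpha)}$ (i.e. there are no summation constraints that involve two different subchains of $\Delta^{(\alpha)}$). This completes the proof of (9.14), and hence of (9.13). □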
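The splitting $I = \sum_\alpha I_\alpha$ used in this proof is the elementary expansion of a product of factors $(1 - \mathbf{1}(\cdots))$ into a signed sum over subsets of constraints, where each chosen indicator forces two indices to coincide (a merging). A minimal sketch of that expansion follows; the slot layout, the list `pairs`, and the function names are hypothetical choices made only for illustration.

```python
from itertools import combinations, product

def expand_indicator(pairs, values):
    """Signed sum over subsets S of `pairs` of (-1)^|S| * prod_{(i,j) in S} 1(values[i] == values[j]).

    Each subset S corresponds to one term I_alpha: the chosen pairs are merged.
    """
    total = 0
    for size in range(len(pairs) + 1):
        for subset in combinations(pairs, size):
            if all(values[i] == values[j] for i, j in subset):
                total += (-1) ** size
    return total

def distinct_indicator(pairs, values):
    """The product of (1 - 1(values[i] == values[j])) over all constrained pairs."""
    return int(all(values[i] != values[j] for i, j in pairs))

# Toy setup: two "subchains" with summation slots {0, 1} and {2};
# only pairs of slots from different subchains are constrained to be distinct.
pairs = [(0, 2), (1, 2)]
agree = all(
    expand_indicator(pairs, vals) == distinct_indicator(pairs, vals)
    for vals in product(range(3), repeat=3)
)
print(agree)  # True
```

The number of terms in the expansion is $2^{\text{number of pairs}}$, which depends only on the graph and not on the range of the indices, in line with the remark that the sum over $\alpha$ ranges over a finite set depending only on $\Delta$.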



3 By union we mean that the chains $\Delta_1, \dots, \Delta_k$ may share external vertices but not summation vertices.

9.4. Removing Simplification (S1) and completion of the proof of Theorem 4.8. In this section we remove Simplification (S1) and put the arguments from Sections 6, 8, 9.1, 9.2, and 9.3 together to complete the proof of Theorem 4.8 in full generality.

The following tensorization property of weights plays a crucial role in this section. 



Lemma 9.6. If $w'(\mathbf{a}')$ and $w''(\mathbf{a}'')$ are weights then so is $w(\mathbf{a}', \mathbf{a}'') := w'(\mathbf{a}')\, w''(\mathbf{a}'')$.

Proof. The claim easily follows from Definition 4.4. □



Recall that Simplification (SI) states that no index coincidences occur among the indices $a$ when we
compute the $p$-th power of $X^w(\Delta)$, i.e. in going from $\Delta$ to $\gamma^p(\Delta)$. In order to relax Simplification (SI), we go
back to Section 6. In this section we add a tilde to the original summation indices in (6.5): $\tilde a = (\tilde a_i)_{i \in V_s(\gamma^p(\Delta))}$
(we shall use $a$ to denote the merged summation indices; see below). Let $\tilde w(\tilde a)$ denote the product weight
(see Lemma 9.6) in the $p$-fold copy of $X^w(\Delta)$. In general, if we do not assume Simplification (SI) then in
(6.5) the original summation vertices $\tilde a$ associated with different copies of $\Delta$ may coincide. As in the proof
of Proposition 9.4, we split the summation using partitions by introducing the factor

\[ 1 = \sum_{P} \mathbf{1}(P(\tilde a) = P) \]

into the right-hand side of (6.5). Here the summation ranges over partitions $P$ of $V_s(\gamma^p(\Delta))$. Thus we get
a finite collection of terms indexed by partitions $P$, which we estimate individually. (The combinatorics
stemming from the number of partitions is independent of $N$ and will be included in the irrelevant constant
prefactors in the final estimate.)
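The partition splitting can be illustrated on a toy scale. The following is a minimal sketch, not taken from the paper: the helper names `induced_partition` and `split_by_partition` are hypothetical, and the index range and tuple length are kept tiny so that brute-force enumeration is feasible. It groups index tuples according to the partition of positions induced by coinciding indices, and shows that the number of classes depends only on the number of summation indices, not on $N$.

```python
from itertools import product
from math import perm


def induced_partition(indices):
    """Partition of the positions 0..k-1 induced by coinciding index values."""
    blocks = {}
    for pos, val in enumerate(indices):
        blocks.setdefault(val, []).append(pos)
    # Canonical form: blocks listed in order of their smallest position.
    return tuple(sorted(tuple(b) for b in blocks.values()))


def split_by_partition(N, k):
    """Count the tuples in {0,...,N-1}^k that induce each partition."""
    counts = {}
    for a in product(range(N), repeat=k):
        P = induced_partition(a)
        counts[P] = counts.get(P, 0) + 1
    return counts


# Toy parameters (assumptions for illustration only).
counts = split_by_partition(N=6, k=4)
# Every tuple induces exactly one partition, so the classes add up to N^k.
total = sum(counts.values())
# A partition with b blocks is induced by N(N-1)...(N-b+1) tuples.
blockwise_ok = all(c == perm(6, len(P)) for P, c in counts.items())
```

For four indices there are 15 partitions whenever $N \geq 4$, which is the sense in which the combinatorial prefactor is independent of $N$.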

Thus, for the sequel we choose and fix a partition $P$ of $V_s(\gamma^p(\Delta))$. If two vertices of $V_s(\gamma^p(\Delta))$ are in the
same block of $P$, we merge them and get a single vertex. Thus we get a new graph which we denote by $\gamma^p_P(\Delta)$.
As before we have the splitting $V(\gamma^p_P(\Delta)) = V_e(\gamma^p_P(\Delta)) \cup V_s(\gamma^p_P(\Delta))$, where $V_e(\gamma^p_P(\Delta)) = V_e(\gamma^p(\Delta)) = V_e(\Delta)$
and $V_s(\gamma^p_P(\Delta))$ is given by the blocks of $P$. We use $a = (a_i)_{i \in V_s(\gamma^p_P(\Delta))}$ to denote the summation indices of
the graph $\gamma^p_P(\Delta)$. Each summation vertex of $\gamma^p_P(\Delta)$ is either unmerged or merged, depending on whether
the associated block of $P$ is of size one or greater than one. We have the trivial lift $\tilde a = L_P(a)$ defined by
$\tilde a_l = a_i$ if $l$ belongs to the block $i$ of $P$. In merging two vertices $i$ and $j$ in $V_s(\gamma^p(\Delta))$, we lose in general all
mechanisms that extract smallness (ingredients (b) and (c) in the list of the guiding principle of Section 5)
from them, including the linking associated with the possible factors $Q_{a_i}$ or $Q_{a_j}$. On the other hand, we gain
a factor $M^{-1}$ from the reduction of the combinatorics of the summation. Generally, the reduced summation
yields a factor $M^{|V_s(\gamma^p_P(\Delta))| - |V_s(\gamma^p(\Delta))|}$. More precisely,

\[ \sum_a w_P(a) \leq 1, \qquad w_P(a) := \tilde w(L_P(a)) \, M^{|V_s(\gamma^p(\Delta))| - |V_s(\gamma^p_P(\Delta))|}. \tag{9.16} \]

This follows from (4.4) and the fact that $\tilde w$ is a weight by Lemma 9.6. We stress that this is the only point
where the assumption (4.4) is needed in our proof.



Having fixed the merging of the vertices, we may now construct all graphs $\Gamma \in \mathfrak{G}^p_P(\Delta)$; note that this
set now depends on $P$. Here $\mathfrak{G}^p_P(\Delta)$ is constructed using the same algorithm as $\mathfrak{G}^p(\Delta)$ in Section 6. In
this case, however, each graph $\Gamma \in \mathfrak{G}^p_P(\Delta)$ has the property that unmerged summation vertices of $\gamma^p_P(\Delta)$
which come with a $Q$ have been linked with an edge of $\Gamma$. There is no similar constraint for merged
vertices. (The proof is the same as that for $\mathfrak{G}^p(\Delta)$ in Section 6.)

Now we may repeat the arguments of Sections 8, 9.1, 9.2 and 9.3 almost verbatim. The only difference
is that we only gain from the unmerged vertices of $\Gamma$. For example, if $i \in V_s(\Gamma)$ is unmerged and satisfies
$i \in n(F)$, then it must have been linked with an edge. Similarly, if $i \in V_s(\Gamma)$ is unmerged and marked, it
will give rise to a chain vertex after vertex resolution, and hence a factor $\Phi$.

In order to account for the gain from the merged original summation vertices of $\Gamma$, we interpret the
estimate (9.16) as stating that each summation vertex $i$ of $\gamma^p(\Delta)$ carries a factor $M^{-1/2}$. This means if $i$
is merged then we gain a factor $M^{-1/2}$ over the unmerged scenario. (It is easy to see that this counting
corresponds to the worst-case scenario where vertices of $\gamma^p(\Delta)$ were paired to get $\gamma^p_P(\Delta)$. For example, if
we have a weight $w(a, b, c, d)$ with $\sum_{a,b,c,d} w(a, b, c, d) = 1$ and we merge $a$ with $b$ and $c$ with $d$, then the new
weight $w_P(a, c) = w(a, a, c, c)$ will sum up to

\[ \sum_{a,c} w_P(a,c) = \sum_{a,c} w(a,a,c,c) \leq M^{-2} \]

by (4.4). The gain of order $M^{-2}$ can then be distributed among the four vertices involved in the merging,
each receiving a factor $M^{-1/2}$.) This gain of $M^{-1/2}$ compensates any possible gain associated with $i$, which
is at best \E'$ (in the case where n(i) is marked and belongs to F). See the guiding principle in Section [5] 
The proof is then completed by the simple observation that M -1 / 2 ^ 
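The merging example with a weight $w(a, b, c, d)$ can be checked in exact arithmetic. The sketch below is illustrative only: it assumes a product weight built from a hypothetical one-index weight `s` that sums to 1 and is bounded by $1/M$, which stands in for the paper's condition (4.4) rather than reproducing it, and the names `site_weight`, `product_weight` and `merged_sum` are not from the source.

```python
from fractions import Fraction
from itertools import product


def site_weight(M):
    """Hypothetical one-index weight on 2M sites: sums to 1, each value <= 1/M."""
    big, small = Fraction(3, 4 * M), Fraction(1, 4 * M)
    return [big if a % 2 == 0 else small for a in range(2 * M)]


def product_weight(s):
    """Four-index product weight w(a, b, c, d) = s[a] s[b] s[c] s[d]."""
    return lambda a, b, c, d: s[a] * s[b] * s[c] * s[d]


def merged_sum(w, n_sites):
    """Sum of w(a, a, c, c), i.e. after merging a with b and c with d."""
    return sum(w(a, a, c, c) for a, c in product(range(n_sites), repeat=2))


M = 5  # toy value of the band width parameter (assumption)
s = site_weight(M)
w = product_weight(s)
n_sites = len(s)

# Unmerged: the product weight still sums to 1.
total = sum(w(*idx) for idx in product(range(n_sites), repeat=4))
# Merged: two summations are lost, and the sum drops to at most M^{-2}.
merged = merged_sum(w, n_sites)
```

With these toy values the merged sum is $1/64$, below $M^{-2} = 1/25$; splitting $M^{-2}$ evenly over the four merged vertices is what gives $M^{-1/2}$ per vertex.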



10. Proof of Theorem 4.15 



In this section we prove Theorem 4.15. The proof relies on some ideas from the proof of Theorem 4.8 but
is considerably easier. The strategy is to resolve (using the Family B identities) the summation vertices
(associated with indices $a$) using the partial expectation $\prod_{a \in \mathbf{a}} P_a$ and to estimate the resulting averaging
using Theorem 4.8. Thus, unlike in the proof of Theorem 4.8, there is no need to estimate high moments.
Before giving the general proof, let us consider the simple example

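\begin{align*}
P_a G_{\mu a} G_{a \mu}
&= P_a \frac{m^2}{G_{aa}^2} G_{\mu a} G_{a \mu} + P_a \Bigl( 1 - \frac{m^2}{G_{aa}^2} \Bigr) G_{\mu a} G_{a \mu} \\
&= m^2 P_a \sum_{x,y}^{(a)} G^{(a)}_{\mu x} h_{xa} h_{ay} G^{(a)}_{y \mu} + O_\prec(\Psi^3) \\
&= m^2 \sum_{x}^{(a)} s_{ax} G^{(a)}_{\mu x} G^{(a)}_{x \mu} + O_\prec(\Psi^3) \\
&= m^2 \sum_{x}^{(a)} s_{ax} G_{\mu x} G_{x \mu} + O_\prec(\Psi^3) \\
&= O_\prec(\Psi^2 \Phi),
\end{align*}

where in the second step we used (3.14a) and the bound $\Lambda \prec \Psi$, in the third step (2.1), in the fourth step
(3.13), and in the last step Theorem 4.8 (or Proposition 5.3).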



The argument for a general graph $\Delta$ is similar. We have to gain a factor $\Phi$ from each vertex $i \in V_c(\Delta)$
(in addition to the trivial $\deg(\Delta)$ factors $\Psi$). We use the terminology of Sections 6 - 9 without further
comment. The proof consists of the following steps, which we merely sketch as they are almost identical to
those of Sections 8 and 9.
(i) Make all entries of $Z_{\mathbf a}(\Delta)$ maximally expanded in $\mathbf a$ using the algorithm from the proof of Lemma 6.3.
The resulting linking yields a set of graphs $\mathfrak{G}(\Delta)$ satisfying (recall the notation (8.3))

\[ Z_{\mathbf a}(\Delta) = \sum_{\Gamma \in \mathfrak{G}(\Delta)} \mathcal{A}_{\mathbf a}(\Gamma) + O_\prec\bigl( \Psi^{\deg(\Delta) + |V_c(\Delta)|} \bigr), \]

where all resolvent entries of $\mathcal{A}_{\mathbf a}(\Gamma)$ are maximally expanded in $\mathbf a$. Each graph $\Gamma \in \mathfrak{G}(\Delta)$ resulted from
$\Delta$ by a finite number (possibly zero) of linking operations. In particular, $V_s(\Gamma) = V_s(\Delta)$.

(ii) Let $V_m(\Gamma) \subset V_c(\Delta)$ denote those vertices of $V_c(\Delta)$ that were not linked to in the process $\Delta \mapsto \Gamma$. (In
other words, $i \in V_m(\Gamma)$ if and only if $\deg_\Delta(i) = \deg_\Gamma(i)$.) We have to gain a factor $\Phi$ from each vertex
$i \in V_m(\Gamma)$; note that each $i \in V_c(\Delta) \setminus V_m(\Gamma)$ yields a factor $\Psi$ due to the additional edge incident to $i$
produced by the linking to $i$.

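Now we follow the vertex resolution of Sections 8, 9.1 and 9.2 to the letter. The only difference is that
$\mathcal{A}_{\mathbf a}(\Gamma)$ is not contained within a full expectation $\mathbb{E}$ but a partial expectation $\prod_{a \in \mathbf a} P_a$ instead. We
resolve all vertices in $V_s(\Gamma)$, which yields the splitting

\[ Z_{\mathbf a}(\Delta) = \sum_{\Gamma \in \mathfrak{G}(\Delta)} \sum_{\Upsilon \in \mathfrak{L}(\mathfrak{R}(\Gamma))} \sum_{\mathbf x}^{(\mathbf a)} \mathcal{A}_{\mathbf a, \mathbf x}(\Upsilon) + O_\prec\bigl( \Psi^{\deg(\Delta) + |V_c(\Delta)|} \bigr), \]

where $\mathbf x \in \{1, \dots, N\}^{V_f(\Upsilon)}$ denotes the fresh summation indices of $\Upsilon$.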



(iii) Exactly as in Sections 8.4 and 9.1, each vertex $i \in V_m(\Gamma)$ either carries an extra factor $\Phi$ (if an error
term of subleading order was chosen in the resolution of $i$) or gives rise to a fresh summation vertex
$j \in p^{-1}(i)$ that is a chain vertex of $\Upsilon$. Hence we may invoke Theorem 4.8 for each fixed $\Upsilon \in \mathfrak{L}(\mathfrak{R}(\Gamma))$,
to get

\[ \sum_{\mathbf x}^{(\mathbf a)} \mathcal{A}_{\mathbf a, \mathbf x}(\Upsilon) \prec \Psi^{\deg(\Delta)} \Phi^{|V_c(\Delta)|}. \]

This concludes the proof of Theorem 4.15.



A. Basic resolvent bounds 



In this appendix we collect some useful tools about resolvents, and in particular prove Lemmas 3.8, 3.9 and 9.1.



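Proof of Lemma 3.8. Let $\varepsilon > 0$ and $D > 0$ be arbitrary. From (2.4) and (2.9) we find that there exist
$c_0, c_1 \in (0, \varepsilon/2)$ and an event $\Xi$ such that

\[ \Lambda(z) \mathbf{1}(\Xi) \leq N^{c_0} \Psi(z) \leq N^{-c_1} \]

for all $z \in \mathbf S$ and large enough $N$, and $\mathbb{P}(\Xi^c) \leq N^{-D}$. Thus we conclude using (3.11) that

\[ \sup_{z \in \mathbf S} \max_i \bigl( |1/G_{ii}(z)| \, \mathbf{1}(\Xi) \bigr) \leq C \]

for large enough $N$. Using the first identity of (3.13) and (3.11) again, we find

\begin{align*}
\sup_{z \in \mathbf S} \max_{|T| = \ell} \max_{i,j \notin T} \bigl( |G^{(T)}_{ij}(z) - \delta_{ij} m(z)| \, \mathbf{1}(\Xi) \bigr) &\leq C N^{c_0} \Psi(z), \\
\sup_{z \in \mathbf S} \max_{|T| = \ell} \max_{i \notin T} \bigl( |1/G^{(T)}_{ii}(z)| \, \mathbf{1}(\Xi) \bigr) &\leq C
\end{align*}
\tag{A.1}

for $\ell = 1$. Using the first identity of (3.13) and (3.11), we may now proceed inductively on $\ell = 1, 2, \dots$, at
each step proving (A.1) for $\ell$ assuming it holds for $\ell - 1$. The result is

\[ \sup_{z \in \mathbf S} \max_{|T| = \ell} \max_{i,j \notin T} \bigl( |G^{(T)}_{ij}(z) - \delta_{ij} m(z)| \, \mathbf{1}(\Xi) \bigr) \leq C_\ell N^{c_0} \Psi(z) \leq N^{\varepsilon} \Psi(z) \tag{A.2} \]

for all $z \in \mathbf S$. This concludes the proof. □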
Proof of Lemma 3.9. The estimate (3.18) follows immediately from $|G^{(T)}_{ij}(E + i\eta)| \leq \eta^{-1}$ and the definition
of $\mathbf S$.

In order to prove (3.19), we choose $D := 10p$ and let $\Xi$ denote the event from the proof of Lemma 3.8
above. First we deal with the high-probability event $\Xi$. From (A.2) and (3.11) we immediately get

\[ \sup_{z \in \mathbf S} \sup_{|T| \leq \ell} \max_{i \notin T} \bigl( |1/G^{(T)}_{ii}(z)| \, \mathbf{1}(\Xi) \bigr) \leq C. \tag{A.3} \]

In order to handle the exceptional event $\Xi^c$ we use Schur's formula (3.12). Then by Cauchy-Schwarz, (3.18),
and (2.5), we find

\[ \mathbb{E} \bigl| 1/G^{(T)}_{ii}(z) \cdot \mathbf{1}(\Xi^c) \bigr|^p \leq \Bigl( \mathbb{E} \bigl| 1/G^{(T)}_{ii}(z) \bigr|^{2p} \Bigr)^{1/2} \mathbb{P}(\Xi^c)^{1/2} \leq (C + N^3)^p N^{-5p}. \tag{A.4} \]

Combining (A.3) and (A.4) yields (3.19). □
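The exponent bookkeeping for the exceptional event can be sanity-checked with exact rational arithmetic. This is a sketch under stated assumptions: it takes the bound in (A.4) to have the form $(C + N^3)^p N^{-5p}$, with $N^{-5p}$ arising as $\mathbb{P}(\Xi^c)^{1/2} \leq N^{-D/2}$ for $D = 10p$, and the function name `exceptional_bound` as well as the numerical values of $N$, $p$ and $C$ are hypothetical.

```python
from fractions import Fraction


def exceptional_bound(N, p, C):
    """Value of (C + N^3)^p * N^(-5p), computed exactly."""
    return Fraction(C + N ** 3) ** p * Fraction(1, N ** (5 * p))


def simple_upper_bound(N, p):
    """(2 / N^2)^p, which dominates the bound above as soon as C <= N^3."""
    return Fraction(2, N ** 2) ** p


# Toy values (assumptions for illustration only).
N, p, C = 10, 2, 5
value = exceptional_bound(N, p, C)
upper = simple_upper_bound(N, p)
```

Since $(C + N^3) N^{-5} \leq 2 N^{-2}$ whenever $C \leq N^3$, the contribution of the exceptional event is bounded uniformly in $N$, so it does not spoil the constant bound obtained on the high-probability event.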



Proof of Lemma 9.1. To simplify notation, we set $T = \emptyset$ (the proof for nonempty $T$ is the same). A
simple large deviation estimate (see e.g. Lemmas B.1 and B.2 in [15]) applied to

\[ Z_i = \sum_{k}^{(i)} \bigl( |h_{ik}|^2 - s_{ik} \bigr) G^{(i)}_{kk} + \sum_{k \neq l}^{(i)} h_{ik} G^{(i)}_{kl} h_{li} \]

implies $Z_i \prec \Psi$.

As above, for the estimate of $U_i^{(S)}$ we set $S = \emptyset$ to simplify notation. Using (2.3) we write

\[ U_i = \sum_k s_{ik} (G_{kk} - m) = \sum_k s_{ik} P_k (G_{kk} - m) + \sum_k s_{ik} Q_k (G_{kk} - m) = \sum_k s_{ik} P_k (G_{kk} - m) + O_\prec(\Psi^2), \]

where the last step follows from Proposition 6.1. Now we expand the inverse of (3.14c) using (2.7) to get

\[ G_{kk} - m = m^2 \bigl( -h_{kk} + Z_k + U^{(k)}_k \bigr) + O_\prec(\Psi^2), \]

where we estimated the higher-order terms using (3.11) and the trivial bounds $h_{kk} \prec \Psi$, $U^{(k)}_k \prec \Psi$, and
$Z_k \prec \Psi$ (as proved in the previous paragraph). Using $P_k h_{kk} = 0$ and $P_k Z_k = 0$ we therefore get

\begin{align*}
U_i &= m^2 \sum_k s_{ik} \Bigl( \sum_{l}^{(k)} s_{kl} P_k G^{(k)}_{ll} - m \Bigr) + O_\prec(\Psi^2) \\
&= m^2 \sum_{k,l} s_{ik} s_{kl} (G_{ll} - m) + O_\prec(\Psi^2) \\
&= m^2 \sum_k s_{ik} U_k + O_\prec(\Psi^2),
\end{align*}

where in the third step we used (2.2), (2.3), (2.9), and (3.13). Inverting the operator $1 - m^2 S$ therefore
yields $U_i \prec \varrho \Psi^2$. On the other hand, the estimate $U_i \prec \Psi$ is trivial. This concludes the proof. □



B. The coefficient $\varrho$ for band matrices



In this section we prove an explicit bound for the coefficient $\varrho$ defined in (3.2), in the case that $S$ is the
variance matrix of a band matrix $H$, as defined in Example 2.1. In fact, we need only that the spectrum
$\sigma(S)$ of $S$ is separated away from $-1$; that this is always true for band matrices is the content of the following
lemma.



Lemma B.1. Suppose that $H$ is a $d$-dimensional band matrix from Example 2.1. Then there is a constant
$\delta_- > 0$, depending only on the profile function $f$, such that $\sigma(S) \subset [-1 + \delta_-, 1]$.



Proof. See [15, Lemma A.1]. □



Proposition B.2. Let $S$ be a doubly stochastic matrix satisfying $\sigma(S) \subset [-1 + \delta_-, 1]$ for some $\delta_- > 0$. Then
there is a universal constant $C$ such that

\[ \varrho \leq \frac{C \log N}{\min\{\delta_-, (\operatorname{Im} m)^2\}}. \tag{B.1} \]

In particular, using Lemma B.1 we find that (B.1) holds for a $d$-dimensional band matrix from Example 2.1
with a constant $C$ depending only on the profile function $f$.

The rest of this appendix is devoted to the proof of Proposition B.2. A similar argument was given in
the proof of [15, Lemma 3.5]. The main difference is that here we do not assume the existence of a spectral
gap near $+1$ in the spectrum of $S$.

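Proof of Proposition B.2. Abbreviate $\zeta := m^2$ and write

\[ \frac{1}{1 - \zeta S} = \frac{1/2}{1 - (1 + \zeta S)/2}. \]

We have the bound

\[ \Bigl\| \frac{1 + \zeta S}{2} \Bigr\|_{\ell^\infty \to \ell^\infty} \leq 1, \]

where we used that $|\zeta| \leq 1$, as follows from (3.11). By the condition on the spectrum, we have

\[ \Bigl\| \frac{1 + \zeta S}{2} \Bigr\|_{\ell^2 \to \ell^2} \leq \max_{x \in [-1 + \delta_-, 1]} \frac{|1 + \zeta x|}{2} \leq \max\Bigl\{ 1 - \frac{\delta_-}{2}, \frac{|1 + \zeta|}{2} \Bigr\}. \]

An elementary calculation yields $1 - |1 + \zeta|/2 \geq c (\operatorname{Im} m)^2$ for some constant $c > 0$, from which we conclude

\[ \Bigl\| \frac{1 + \zeta S}{2} \Bigr\|_{\ell^2 \to \ell^2} \leq 1 - c \min\{\delta_-, (\operatorname{Im} m)^2\} \tag{B.2} \]

for some small universal constant $c > 0$. For $n_0 \in \mathbb{N}$ we therefore have

\begin{align*}
\Bigl\| \frac{1}{1 - \zeta S} \Bigr\|_{\ell^\infty \to \ell^\infty}
&\leq \sum_{n = 0}^{n_0 - 1} \Bigl\| \frac{1 + \zeta S}{2} \Bigr\|^n_{\ell^\infty \to \ell^\infty} + \sqrt{N} \sum_{n = n_0}^{\infty} \Bigl\| \frac{1 + \zeta S}{2} \Bigr\|^n_{\ell^2 \to \ell^2} \\
&\leq n_0 + \sqrt{N} \bigl( 1 - c \min\{\delta_-, (\operatorname{Im} m)^2\} \bigr)^{n_0} \frac{C}{\min\{\delta_-, (\operatorname{Im} m)^2\}} \\
&\leq \frac{C \log N}{\min\{\delta_-, (\operatorname{Im} m)^2\}},
\end{align*}

where the last step follows by taking $n_0 = C_0 \log N / \min\{\delta_-, (\operatorname{Im} m)^2\}$ for large enough $C_0 > 0$. □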
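The scalar ingredients of this proof are easy to test numerically. The sketch below is illustrative and rests on a few assumptions that are not spelled out in this appendix: $m$ is taken to be the Stieltjes transform of the semicircle law (the root of $m^2 + zm + 1 = 0$ with positive imaginary part), the constant in the elementary calculation is taken to be $c = 1/2$, which follows from $|1 + m^2|^2 = (1 + |m|^2)^2 - 4 (\operatorname{Im} m)^2 \leq 4 - 4 (\operatorname{Im} m)^2$ for $|m| \leq 1$, and $\delta_- \leq 1$. All function names are hypothetical.

```python
import cmath
import math


def m_semicircle(z):
    """Root of m^2 + z m + 1 = 0 with Im m > 0 (assumes Im z > 0)."""
    r = cmath.sqrt(z * z - 4)
    m1, m2 = (-z + r) / 2, (-z - r) / 2
    return m1 if m1.imag > 0 else m2


def l2_norm_bound(zeta, delta):
    """Right-hand side max{1 - delta/2, |1 + zeta|/2} of the spectral bound."""
    return max(1 - delta / 2, abs(1 + zeta) / 2)


def neumann_bound(N, kappa, C0=1.0):
    """n0 + sqrt(N) (1 - kappa)^n0 / kappa with n0 = ceil(C0 log N / kappa)."""
    n0 = math.ceil(C0 * math.log(N) / kappa)
    return n0 + math.sqrt(N) * (1 - kappa) ** n0 / kappa


def check_scalar_steps(z, delta, grid=200):
    """Check the scalar inequalities on a grid of x in [-1 + delta, 1]."""
    m = m_semicircle(z)
    zeta = m * m
    assert abs(m) <= 1 + 1e-12
    # Elementary calculation with the explicit constant c = 1/2.
    assert 1 - abs(1 + zeta) / 2 >= m.imag ** 2 / 2 - 1e-12
    for k in range(grid + 1):
        x = -1 + delta + (2 - delta) * k / grid
        # Spectral bound: the maximum over the interval sits at an endpoint.
        assert abs(1 + zeta * x) / 2 <= l2_norm_bound(zeta, delta) + 1e-12
        # Scalar version of the resolvent rewriting used for the Neumann series.
        lhs = 1 / (1 - zeta * x)
        rhs = 0.5 / (1 - (1 + zeta * x) / 2)
        assert cmath.isclose(lhs, rhs, rel_tol=1e-9)
    return m


m = check_scalar_steps(0.5 + 0.1j, delta=0.3)
total = neumann_bound(N=10 ** 4, kappa=0.01)
```

Here `kappa` plays the role of $c \min\{\delta_-, (\operatorname{Im} m)^2\}$. With $C_0 = 1$ the tail term is at most $N^{-1/2}/\kappa$, so the total stays below $(\log N + 2)/\kappa$, in line with the logarithmic bound (B.1).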



References

[1] L. Erdős and A. Knowles, Quantum diffusion and delocalization for band matrices with general distribution, Ann. H. Poincaré 12 (2011), 1227-1319.

[2] L. Erdős and A. Knowles, Quantum diffusion and eigenfunction delocalization in a random band matrix model, Comm. Math. Phys. 303 (2011), 509-554.

[3] L. Erdős, A. Knowles, H.T. Yau, and J. Yin, Delocalization and diffusion profile for random band matrices, Preprint arXiv:1205.5669.

[4] L. Erdős, A. Knowles, H.T. Yau, and J. Yin, The local semicircle law for a general class of random matrices, Preprint arXiv:1212.0164.

[5] L. Erdős, A. Knowles, H.T. Yau, and J. Yin, Spectral statistics of Erdős-Rényi graphs I: Local semicircle law, to appear in Ann. Prob. Preprint arXiv:1103.1919.

[6] L. Erdős, A. Knowles, H.T. Yau, and J. Yin, Spectral statistics of Erdős-Rényi graphs II: Eigenvalue spacing and the extreme eigenvalues, to appear in Comm. Math. Phys. Preprint arXiv:1103.3869.

[7] L. Erdős, S. Péché, J. A. Ramírez, B. Schlein, and H.T. Yau, Bulk universality for Wigner matrices, Comm. Pure Appl. Math. 63 (2010), 895-925.

[8] L. Erdős, J. Ramírez, B. Schlein, T. Tao, V. Vu, and H.T. Yau, Bulk universality for Wigner hermitian matrices with subexponential decay, Math. Res. Lett. 17 (2010), 667-674.

[9] L. Erdős, J. Ramírez, B. Schlein, and H.T. Yau, Universality of sine-kernel for Wigner matrices with a small Gaussian perturbation, Electr. J. Prob. 15 (2010), 526-604.

[10] L. Erdős, B. Schlein, and H.T. Yau, Local semicircle law and complete delocalization for Wigner random matrices, Comm. Math. Phys. 287 (2009), 641-655.

[11] L. Erdős, B. Schlein, and H.T. Yau, Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices, Ann. Prob. 37 (2009), 815-852.

[12] L. Erdős, B. Schlein, and H.T. Yau, Wegner estimate and level repulsion for Wigner random matrices, Int. Math. Res. Not. 2010 (2009), 436-479.

[13] L. Erdős, B. Schlein, and H.T. Yau, Universality of random matrices and local relaxation flow, Invent. Math. 185 (2011), no. 1, 75-119.

[14] L. Erdős, B. Schlein, H.T. Yau, and J. Yin, The local relaxation flow approach to universality of the local statistics of random matrices, Ann. Inst. Henri Poincaré (B) 48 (2012), 1-46.

[15] L. Erdős, H.T. Yau, and J. Yin, Bulk universality for generalized Wigner matrices, Preprint arXiv:1001.3453.

[16] L. Erdős, H.T. Yau, and J. Yin, Rigidity of eigenvalues of generalized Wigner matrices, to appear in Adv. Math. Preprint arXiv:1007.4652.

[17] L. Erdős, H.T. Yau, and J. Yin, Universality for generalized Wigner matrices with Bernoulli distribution, J. Combinatorics 1 (2011), no. 2, 15-85.

[18] N.S. Pillai and J. Yin, Universality of covariance matrices, Preprint arXiv:1110.2501.